Implement Fisher Yates algorithm #172

Merged
mchav merged 3 commits into DataHaskell:main from kayvank:170/implement-Fisher-Yates-algorithm
Mar 2, 2026

Conversation

@kayvank
Contributor

@kayvank kayvank commented Mar 1, 2026

No description provided.

@kayvank kayvank marked this pull request as draft March 1, 2026 01:13
@kayvank kayvank force-pushed the 170/implement-Fisher-Yates-algorithm branch from 402f667 to 7eaf4e7 Compare March 1, 2026 01:14
@daikonradish
Contributor

Thanks for submitting a PR! I'll have a good look at it later but if you're open to it, you could look at a one-time test for randomness here:

https://cnut1648.github.io/files/posts/Test_for_rand.pdf

Basically, take the indices output by shuffleVec and look at their distribution. But that might be overkill.
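As a rough illustration of what "look at the distribution" could mean, here is a minimal sketch that counts how often each index lands in the first position across many shuffles and computes a Pearson chi-square statistic against a uniform expectation. The names `firstPositionCounts` and `chiSquareUniform` are hypothetical and not part of the PR, and checking only the first position is a partial test, not a full test of randomness. In practice the input lists would come from running the shuffle under many different seeds (for example via `VU.toList`); the fixed sample below is only there to keep the block self-contained.

```haskell
-- Hypothetical helpers for a one-off uniformity check; not part of the PR.

-- | Count how often each index 0 .. k-1 appears in the first position.
-- Each inner list is assumed to be one shuffle of the indices 0 .. k-1.
firstPositionCounts :: Int -> [[Int]] -> [Int]
firstPositionCounts k shuffles =
  [ length (filter (== i) firsts) | i <- [0 .. k - 1] ]
  where
    firsts = [ x | (x : _) <- shuffles ]

-- | Pearson chi-square statistic against a uniform expectation.
-- Assumes a non-empty list of counts with a positive total.
chiSquareUniform :: [Int] -> Double
chiSquareUniform counts =
  sum [ (fromIntegral c - expected) ^ (2 :: Int) / expected | c <- counts ]
  where
    total = fromIntegral (sum counts) :: Double
    expected = total / fromIntegral (length counts)

main :: IO ()
main = do
  -- Stand-in data: all six permutations of [0,1,2], once each.
  let sample = [[0,1,2],[1,0,2],[2,1,0],[0,2,1],[1,2,0],[2,0,1]]
      counts = firstPositionCounts 3 sample
  print counts
  print (chiSquareUniform counts)
```

The statistic would then be compared against a chi-square critical value with k - 1 degrees of freedom. A value far above that threshold suggests the first position is not uniformly distributed.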

  where
    shuffleVec :: (RandomGen g) => g -> VU.Vector Int -> VU.Vector Int
    shuffleVec g v = runST $ do
      vm <- VU.thaw v
Member

Instead of building the vector from a list just to thaw it, we can create a new mutable vector directly inside the shuffleVec function.
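One way this suggestion might look is sketched below: the mutable vector of indices is allocated directly inside the shuffle, then permuted in place with Fisher-Yates swaps. This is not the code that was merged. The name `shuffledIndicesSketch` is hypothetical, and the sketch assumes the `vector` package (0.12.3 or later, for `VUM.generate`) and the `random` package (1.2 or later, for `uniformR`).

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, writeSTRef)
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Unboxed.Mutable as VUM
import System.Random (RandomGen, mkStdGen, uniformR)

-- | Hypothetical sketch: Fisher-Yates shuffle of the indices 0 .. k-1.
-- Returns an empty vector for k <= 0.
shuffledIndicesSketch :: RandomGen g => g -> Int -> VU.Vector Int
shuffledIndicesSketch gen k
  | k <= 0 = VU.empty
  | otherwise = runST $ do
      -- Allocate the mutable vector [0, 1, .., k-1] directly; no list, no thaw.
      vm <- VUM.generate k id
      genRef <- newSTRef gen
      -- Walk from the last position down to 1, swapping each position i
      -- with a uniformly chosen position j in [0, i].
      forM_ [k - 1, k - 2 .. 1] $ \i -> do
        g <- readSTRef genRef
        let (j, g') = uniformR (0, i) g
        writeSTRef genRef g'
        VUM.swap vm i j
      -- Safe here because vm is not used after freezing.
      VU.unsafeFreeze vm

main :: IO ()
main = print (shuffledIndicesSketch (mkStdGen 42) 5)
```

Threading the generator through an `STRef` keeps the function pure from the caller's point of view: the same generator and size always yield the same permutation.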

@kayvank kayvank force-pushed the 170/implement-Fisher-Yates-algorithm branch from fc6f38a to cc83918 Compare March 1, 2026 17:57
Issue 170, implement PR comments and clean up build warnings
@kayvank kayvank force-pushed the 170/implement-Fisher-Yates-algorithm branch from cc83918 to 3e69bd5 Compare March 1, 2026 17:58
@mchav
Member

mchav commented Mar 1, 2026

@kayvank this looks good. Just add some simple tests, mostly that indices aren't dropped or duplicated and that it doesn't fail on empty input.
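The checks requested here boil down to one property: the output must be a permutation of 0 .. k-1, with the empty case included. A minimal sketch of such a predicate follows. The name `isPermutationOfRange` is hypothetical and these are not the tests that were added to the PR. In a real test it would be applied to something like `VU.toList (shuffledIndices gen k)`.

```haskell
import Data.List (sort)

-- | Hypothetical check: True when xs contains each of 0 .. k-1 exactly once.
-- For k <= 0 the expected range is empty, so only the empty list passes.
isPermutationOfRange :: Int -> [Int] -> Bool
isPermutationOfRange k xs = sort xs == [0 .. k - 1]

main :: IO ()
main = do
  print (isPermutationOfRange 4 [2, 0, 3, 1]) -- valid permutation
  print (isPermutationOfRange 4 [2, 0, 0, 1]) -- index duplicated (and 3 dropped)
  print (isPermutationOfRange 4 [2, 0, 1])    -- index dropped
  print (isPermutationOfRange 0 [])           -- empty input
```

Sorting and comparing against the expected range catches dropped and duplicated indices in a single comparison, at the cost of an O(k log k) sort, which is usually acceptable in a unit test.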

shuffledIndices :: (RandomGen g) => g -> Int -> VU.Vector Int
shuffledIndices pureGen k = VU.fromList (shuffle' [0 .. (k - 1)] k pureGen)
shuffledIndices pureGen k
  | k <= 0 = VU.empty
Contributor Author

We return an empty vector even when k is negative, which does not seem correct.
Should we raise an error in the rare event that k < 0? @mchav

Member

I think that's fine since the number is derived from the size of the dataframe. And the shuffle of an empty dataframe is an empty dataframe.

@kayvank kayvank force-pushed the 170/implement-Fisher-Yates-algorithm branch from ff07a4e to 0b00734 Compare March 2, 2026 06:49
@kayvank kayvank marked this pull request as ready for review March 2, 2026 06:49
@kayvank
Contributor Author

kayvank commented Mar 2, 2026

@kayvank this looks good. Just add some simple tests, mostly that indices aren't dropped or duplicated and that it doesn't fail on empty input.

Added two new unit tests.

@mchav mchav merged commit c4ae5f8 into DataHaskell:main Mar 2, 2026
7 checks passed