How to shuffle array elements in Python

Python: Shuffle a List (Randomize Python List Elements)


In this tutorial, you’ll learn how to use Python to shuffle a list, thereby randomizing Python list elements. For this, you will learn how to use the Python random library, in particular the random.shuffle() and random.sample() functions.

Knowing how to shuffle a list and produce a random result is a useful skill. For example, it helps when developing a Python game where you need to choose a random result. It also has applications in data-related work, where you may need to pull random samples.

The Quick Answer: Use random.shuffle()


What is the Difference Between .shuffle and .sample?

Python comes with a built-in library for generating randomness, called random. Throughout this tutorial, you’ll learn how to use the random.shuffle() and random.sample() functions. Before we dive into how to use them, however, let’s quickly explore what the differences are.

Both functions produce a randomly ordered list, but they deliver it differently:

  • random.shuffle() shuffles the original list, meaning the shuffling can be done in-place
  • random.sample() returns a new shuffled list, based on the original list

random.sample() can also be used to shuffle strings and tuples, as it creates a new list, thereby allowing you to work on immutable data types.
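As a minimal sketch of that point, the snippet below passes a tuple and a string to random.sample(). The variable names and values are illustrative assumptions. The result is always a list, so it is converted back with tuple() and str.join().

```python
import random

a_tuple = (1, 2, 3, 4, 5)
a_string = "python"

# random.sample() returns a new list and leaves the input untouched
shuffled_tuple = tuple(random.sample(a_tuple, k=len(a_tuple)))
shuffled_string = "".join(random.sample(a_string, k=len(a_string)))

print(shuffled_tuple)
print(shuffled_string)
```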

Now, let’s dive into how to shuffle a list in Python!

Shuffle a Python List and Re-assign It to Itself

The random.shuffle() function makes it easy to shuffle a list’s items in Python. Because the function works in place, there is no need to reassign the list. In fact, random.shuffle() returns None, so writing a_list = random.shuffle(a_list) would discard the list.

Let’s take a look at what this looks like:
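A minimal sketch of the call, assuming a list of ten integers named a_list (the values are illustrative):

```python
import random

a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# shuffle() reorders a_list in place and returns None
random.shuffle(a_list)
print(a_list)
```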

What we’ve done here is:

  1. Created a new list
  2. Applied the random.shuffle() function to it
  3. Printed the result to verify the shuffling

Keep in mind, if you’re following along with the example above, your randomly sorted list will probably look different!

In the next section, you’ll learn how to use the random.sample() function to randomize a list in Python.


Shuffle a Python List and Assign It to a New List

The random.sample() function is used to sample a set number of items from a sequence-like object in Python. The function picks these items randomly.

The function is called as random.sample(population, k).

In this case, population will be the list we want to shuffle, and k refers to the number of items we want to select. Because we want to return the full list in a random order, we will pass the length of the list as the k parameter.

Let’s take a look at how we can use the .sample() function to randomize a Python list:
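A minimal sketch, assuming the same illustrative ten-integer list named a_list:

```python
import random

a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Sampling every element yields a shuffled copy; a_list keeps its order
shuffled = random.sample(a_list, len(a_list))
print(shuffled)
print(a_list)
```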

Let’s take a look at how we’ve managed to randomize our Python list elements:

  1. We generated our list and assigned it to a_list
  2. We created a new variable, shuffled, and assigned it the result of random.sample(). We passed in the list and the length of the list. Using the len() function keeps this method dynamic, as the length of the list may change.

In the next section, you’ll learn how to reproduce a shuffled list result in Python.


Reproduce a Shuffled Python List Result

When working with random results, there may be times when you want to be able to reproduce a result. In this example below, you’ll learn how to be able to reproduce a shuffled list.

We will use the random.seed() function to generate a result that is reproducible.

Let’s take a look at what this looks like:
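A minimal sketch, assuming an illustrative list and the seed value 2. Seeding twice with the same value makes both shuffles come out identical within a single run:

```python
import random

a_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

random.seed(2)
first = random.sample(a_list, len(a_list))

# Re-seeding with the same value replays the same pseudo-random sequence
random.seed(2)
second = random.sample(a_list, len(a_list))

print(first)
print(second)
```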

Now, it may look like the resulting printed lists aren’t random. However, if we were to re-run our program above, it would return the same shuffled lists each time! The random.seed() function sets the base value that determines the pseudo-random output of the functions that follow it. Because, in this case, we assigned it a specific value of 2, we are able to reproduce the randomness.

In the next section, you’ll learn how to shuffle a Python list of lists.


Shuffle a Python List of Lists

In Python, you’ll often encounter multi-dimensional lists, often referred to as lists of lists. Shuffling the elements of each sublist is easy with a for loop: by looping over each list in the list of lists, we can apply the random.shuffle() function to randomize each sublist’s elements.

Let’s take a look at what this looks like:
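A minimal sketch, assuming an illustrative 3×3 list of lists. Only the order inside each sublist changes; the sublists themselves stay in place:

```python
import random

list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Shuffle each sublist in place, one at a time
for sublist in list_of_lists:
    random.shuffle(sublist)

print(list_of_lists)
```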

While we could also accomplish this using a list comprehension, random.shuffle() returns None, so the comprehension would be run only for its side effects and its result discarded, which is awkward and unintuitive. For this reason, we’ve opted to use a for loop here, as we should always be striving for readability.

In the next section, you’ll learn how to shuffle multiple lists with the same order of shuffling.


Shuffle Multiple Lists with the Same Order of Shuffling

Let’s say you have two lists: one that contains the type of fruit and the other the number of that type of fruit you have. You want to shuffle the lists but you want the referential integrity to remain true (meaning that index 0 of both lists would be shuffled to the same index in the shuffled result).

In order to accomplish this, we’ll:

  1. Merge the two lists into a list of tuples using the zip() function
  2. Shuffle that list of tuples
  3. Unpack the list of tuples into individual lists

Let’s take a look at how we can do this:
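A minimal sketch of these steps, assuming two illustrative lists named fruits and quantities:

```python
import random

fruits = ["apple", "banana", "orange", "pear"]
quantities = [3, 5, 2, 7]

# Pair the items up so each fruit travels with its quantity
pairs = list(zip(fruits, quantities))
random.shuffle(pairs)

# Split the shuffled pairs back into two aligned lists
shuffled_fruits = [fruit for fruit, _ in pairs]
shuffled_quantities = [quantity for _, quantity in pairs]

print(shuffled_fruits)
print(shuffled_quantities)
```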

We can see that we’ve made good use of both the zip() function as well as Python list comprehensions to make this happen.


Conclusion

In this tutorial, you learned how to use Python to randomly shuffle a list, thereby putting its items in a random order. For this, you learned how to use the Python random library, in particular the random.shuffle() and random.sample() functions.

To learn more about the random library, check out the official Python documentation for the random module.

Shuffle an array with python, randomize array item order with python

What’s the easiest way to shuffle an array with python?


An alternative is the shuffle() function in sklearn.utils, part of scikit-learn.

Advantage: You can shuffle multiple arrays simultaneously without disrupting the mapping between them, and the random_state parameter controls the shuffling for reproducible behavior.


Just in case you want a new array, you can use sample, for example random.sample(array, len(array)).


The other answers are the easiest; however, it’s a bit annoying that random.shuffle doesn’t actually return anything: it just shuffles the given list in place. If you want to chain calls, or just be able to declare a shuffled array in one line, you can wrap the call in a small helper that copies the list, shuffles the copy, and returns it.
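A minimal sketch of such a helper follows. The name shuffled and the sample calls are assumptions for illustration, not the answer's own code:

```python
import random


def shuffled(items):
    """Return a shuffled copy of items as a list, leaving the input unchanged."""
    result = list(items)
    random.shuffle(result)
    return result


# Because the helper returns the list, it can be used inline
for letter in shuffled("abc"):
    print(letter)

first_card = shuffled(range(52))[0]
print(first_card)
```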

When dealing with regular Python lists, random.shuffle() will do the job just as the previous answers show.

But when it comes to an ndarray (numpy.array), random.shuffle can break the original array: with a multi-dimensional array, rows may end up duplicated.

Instead, use np.random.shuffle(a).

Like random.shuffle, np.random.shuffle shuffles the array in place.

You can sort your array with a random key.

The key is computed only once per item, so the comparisons made during the sort stay efficient.

Even so, random.shuffle(array) should be faster: it runs in O(N), whereas sorting is O(N log N).
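A minimal sketch of the random-key approach, assuming an illustrative list named array:

```python
import random

array = [1, 2, 3, 4, 5]

# The key function is called once per element; the list is then sorted by those random keys
shuffled = sorted(array, key=lambda _: random.random())
print(shuffled)
```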


In addition to the previous replies, I would like to introduce another function.

numpy.random.shuffle as well as random.shuffle perform in-place shuffling. However, if you want to return a shuffled array numpy.random.permutation is the function to use.


I’m not sure why, but when I used random.shuffle() it returned None, so I wrote this; it might be helpful to someone.
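A minimal sketch of what such a function could look like, assuming a Fisher-Yates shuffle applied to a copy. The name my_shuffle is hypothetical and this is not the answer's own code:

```python
import random


def my_shuffle(array):
    """Return a shuffled copy of array (Fisher-Yates), leaving the input unchanged."""
    result = list(array)
    # Walk from the last index down, swapping each slot with a random earlier-or-equal slot
    for i in range(len(result) - 1, 0, -1):
        j = random.randint(0, i)  # randint() is inclusive on both ends
        result[i], result[j] = result[j], result[i]
    return result


print(my_shuffle([1, 2, 3, 4, 5]))
```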

Be aware that random.shuffle() should not be used on multi-dimensional arrays, as it causes repetitions.

Imagine you want to shuffle an array along its first dimension. As a test example, take an array in which the i-th element along the first axis is a 2×3 matrix whose elements are all equal to i.

Python Random shuffle() Method

Shuffle a list (reorganize the order of the list items):

Definition and Usage

The shuffle() method takes a sequence, like a list, and reorganizes the order of the items.

Note: This method changes the original list, it does not return a new list.

Syntax

Parameter Values

Parameter   Description
sequence    Required. A sequence.
function    Optional. The name of a function that returns a number between 0.0 and 1.0. If not specified, the function random() will be used. Deprecated since Python 3.9. Removed in Python 3.11.

More Examples

Example

This example uses the function parameter, which is deprecated since Python 3.9 and removed in Python 3.11.

You can define your own function to weigh or specify the result.

If the function returns the same number each time, the result will be in the same order each time:

random: Generate pseudo-random numbers

This module implements pseudo-random number generators for various distributions.

For integers, there is uniform selection from a range. For sequences, there is uniform selection of a random element, a function to generate a random permutation of a list in-place, and a function for random sampling without replacement.

On the real line, there are functions to compute uniform, normal (Gaussian), lognormal, negative exponential, gamma, and beta distributions. For generating distributions of angles, the von Mises distribution is available.

Almost all module functions depend on the basic function random(), which generates a random float uniformly in the half-open range 0.0 <= X < 1.0. Python uses the Mersenne Twister as the core generator. It produces 53-bit precision floats and has a period of 2**19937-1. The underlying implementation in C is both fast and threadsafe. The Mersenne Twister is one of the most extensively tested random number generators in existence. However, being completely deterministic, it is not suitable for all purposes, and is completely unsuitable for cryptographic purposes.

The functions supplied by this module are actually bound methods of a hidden instance of the random.Random class. You can instantiate your own instances of Random to get generators that don’t share state.
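A minimal sketch of independent generators, assuming an illustrative seed of 42 and ten-element lists. Drawing from the module-level generator between the two shuffles does not disturb either instance:

```python
import random

gen_a = random.Random(42)
gen_b = random.Random(42)

deck_a = list(range(10))
deck_b = list(range(10))

gen_a.shuffle(deck_a)
random.random()  # advances only the hidden module-level generator
gen_b.shuffle(deck_b)

# Both instances started from the same seed, so they produce the same order
print(deck_a == deck_b)
```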

Class Random can also be subclassed if you want to use a different basic generator of your own devising: in that case, override the random(), seed(), getstate(), and setstate() methods. Optionally, a new generator can supply a getrandbits() method; this allows randrange() to produce selections over an arbitrarily large range.

The random module also provides the SystemRandom class which uses the system function os.urandom() to generate random numbers from sources provided by the operating system.

The pseudo-random generators of this module should not be used for security purposes. For security or cryptographic uses, see the secrets module.
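For a shuffle that must not be predictable, one option is to shuffle through a SystemRandom instance, which draws from os.urandom(). The sketch below assumes an illustrative 52-element deck and also shows a token from the secrets module:

```python
import random
import secrets

deck = list(range(52))

# SystemRandom exposes the same methods as Random but uses OS-provided randomness
random.SystemRandom().shuffle(deck)

# The secrets module is the documented choice for tokens and similar values
token = secrets.token_hex(16)

print(deck[:5])
print(token)
```

Seeding has no effect on SystemRandom, so this shuffle cannot be reproduced.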

M. Matsumoto and T. Nishimura, “Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator”, ACM Transactions on Modeling and Computer Simulation Vol. 8, No. 1, January pp.3–30 1998.

Complementary-Multiply-with-Carry recipe for a compatible alternative random number generator with a long period and comparatively simple update operations.

Bookkeeping functions

random.seed(a=None, version=2)

Initialize the random number generator.

If a is omitted or None, the current system time is used. If randomness sources are provided by the operating system, they are used instead of the system time (see the os.urandom() function for details on availability).

If a is an int, it is used directly.

With version 2 (the default), a str, bytes, or bytearray object gets converted to an int and all of its bits are used.

With version 1 (provided for reproducing random sequences from older versions of Python), the algorithm for str and bytes generates a narrower range of seeds.

Changed in version 3.2: Moved to the version 2 scheme which uses all of the bits in a string seed.

Changed in version 3.11: The seed must be one of the following types: NoneType, int, float, str, bytes, or bytearray.

random.getstate()

Return an object capturing the current internal state of the generator. This object can be passed to setstate() to restore the state.

random.setstate(state)

state should have been obtained from a previous call to getstate(), and setstate() restores the internal state of the generator to what it was at the time getstate() was called.
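A minimal sketch of saving and restoring the generator state, which replays the same values without choosing a seed:

```python
import random

state = random.getstate()
first = [random.random() for _ in range(3)]

# Restoring the saved state rewinds the generator to where it was
random.setstate(state)
second = [random.random() for _ in range(3)]

print(first == second)
```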

Functions for bytes

random.randbytes(n)

Generate n random bytes.

This method should not be used for generating security tokens. Use secrets.token_bytes() instead.

New in version 3.9.

Functions for integers

random.randrange(stop)
random.randrange(start, stop[, step])

Return a randomly selected element from range(start, stop, step). This is equivalent to choice(range(start, stop, step)), but doesn’t actually build a range object.

The positional argument pattern matches that of range(). Keyword arguments should not be used because the function may use them in unexpected ways.

Changed in version 3.2: randrange() is more sophisticated about producing equally distributed values. Formerly it used a style like int(random()*n) which could produce slightly uneven distributions.

Deprecated since version 3.10: The automatic conversion of non-integer types to equivalent integers is deprecated. Currently randrange(10.0) is losslessly converted to randrange(10). In the future, this will raise a TypeError.

Deprecated since version 3.10: The exception raised for non-integral values such as randrange(10.5) or randrange('10') will be changed from ValueError to TypeError.

random.randint(a, b)

Return a random integer N such that a <= N <= b. Alias for randrange(a, b+1).

random.getrandbits(k)

Returns a non-negative Python integer with k random bits. This method is supplied with the MersenneTwister generator and some other generators may also provide it as an optional part of the API. When available, getrandbits() enables randrange() to handle arbitrarily large ranges.

Changed in version 3.9: This method now accepts zero for k.

Functions for sequences

random.choice(seq)

Return a random element from the non-empty sequence seq. If seq is empty, raises IndexError.

random.choices(population, weights=None, *, cum_weights=None, k=1)

Return a k sized list of elements chosen from the population with replacement. If the population is empty, raises IndexError .

If a weights sequence is specified, selections are made according to the relative weights. Alternatively, if a cum_weights sequence is given, the selections are made according to the cumulative weights (perhaps computed using itertools.accumulate()). For example, the relative weights [10, 5, 30, 5] are equivalent to the cumulative weights [10, 15, 45, 50]. Internally, the relative weights are converted to cumulative weights before making selections, so supplying the cumulative weights saves work.
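A minimal sketch of that equivalence, assuming an illustrative four-item population and a seed of 1. In CPython the two calls are expected to return the same selections when the generator is seeded identically, since the relative weights are accumulated internally:

```python
import random
from itertools import accumulate

population = ["a", "b", "c", "d"]
weights = [10, 5, 30, 5]
cum_weights = list(accumulate(weights))  # running totals of the relative weights

random.seed(1)
by_weights = random.choices(population, weights=weights, k=8)

random.seed(1)
by_cum_weights = random.choices(population, cum_weights=cum_weights, k=8)

print(cum_weights)
print(by_weights == by_cum_weights)
```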

If neither weights nor cum_weights are specified, selections are made with equal probability. If a weights sequence is supplied, it must be the same length as the population sequence. It is a TypeError to specify both weights and cum_weights.

The weights or cum_weights can use any numeric type that interoperates with the float values returned by random() (that includes integers, floats, and fractions but excludes decimals). Weights are assumed to be non-negative and finite. A ValueError is raised if all weights are zero.

For a given seed, the choices() function with equal weighting typically produces a different sequence than repeated calls to choice(). The algorithm used by choices() uses floating point arithmetic for internal consistency and speed. The algorithm used by choice() defaults to integer arithmetic with repeated selections to avoid small biases from round-off error.

New in version 3.6.

Changed in version 3.9: Raises a ValueError if all weights are zero.

random.shuffle(x)

Shuffle the sequence x in place.

To shuffle an immutable sequence and return a new shuffled list, use sample(x, k=len(x)) instead.

Note that even for small len(x), the total number of permutations of x can quickly grow larger than the period of most random number generators. This implies that most permutations of a long sequence can never be generated. For example, a sequence of length 2080 is the largest that can fit within the period of the Mersenne Twister random number generator.
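The figure of 2080 can be checked with a short calculation: find the largest n for which n! does not exceed the period 2**19937-1. This is a sketch using exact integer arithmetic; the variable names are illustrative:

```python
period = 2**19937 - 1

n = 1
permutations = 1  # holds n! for the current n

# Grow n while (n + 1)! still fits within the period
while permutations * (n + 1) <= period:
    n += 1
    permutations *= n

print(n)
```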

Deprecated since version 3.9, removed in version 3.11: The optional parameter random.

random.sample(population, k, *, counts=None)

Return a k length list of unique elements chosen from the population sequence. Used for random sampling without replacement.

Returns a new list containing elements from the population while leaving the original population unchanged. The resulting list is in selection order so that all sub-slices will also be valid random samples. This allows raffle winners (the sample) to be partitioned into grand prize and second place winners (the subslices).

Members of the population need not be hashable or unique. If the population contains repeats, then each occurrence is a possible selection in the sample.

Repeated elements can be specified one at a time or with the optional keyword-only counts parameter. For example, sample(['red', 'blue'], counts=[4, 2], k=5) is equivalent to sample(['red', 'red', 'red', 'red', 'blue', 'blue'], k=5).
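A minimal sketch of the counts parameter, which requires Python 3.9 or later. Because the population holds only four reds and two blues, a sample of five can never contain more than that of either colour:

```python
import random
from collections import Counter

# Equivalent to sampling from four 'red' entries and two 'blue' entries
picked = random.sample(["red", "blue"], counts=[4, 2], k=5)
tally = Counter(picked)

print(picked)
print(tally)
```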

To choose a sample from a range of integers, use a range() object as an argument. This is especially fast and space efficient for sampling from a large population: sample(range(10000000), k=60).

If the sample size is larger than the population size, a ValueError is raised.

Changed in version 3.9: Added the counts parameter.

Changed in version 3.11: The population must be a sequence. Automatic conversion of sets to lists is no longer supported.

Real-valued distributions

The following functions generate specific real-valued distributions. Function parameters are named after the corresponding variables in the distribution’s equation, as used in common mathematical practice; most of these equations can be found in any statistics text.

random.random()

Return the next random floating point number in the range 0.0 <= X < 1.0

random.uniform(a, b)

Return a random floating point number N such that a <= N <= b for a <= b and b <= N <= a for b < a.

The end-point value b may or may not be included in the range depending on floating-point rounding in the equation a + (b-a) * random().

random.triangular(low, high, mode)

Return a random floating point number N such that low <= N <= high and with the specified mode between those bounds. The low and high bounds default to zero and one. The mode argument defaults to the midpoint between the bounds, giving a symmetric distribution.

random.betavariate(alpha, beta)

Beta distribution. Conditions on the parameters are alpha > 0 and beta > 0. Returned values range between 0 and 1.

random.expovariate(lambd)

Exponential distribution. lambd is 1.0 divided by the desired mean. It should be nonzero. (The parameter would be called “lambda”, but that is a reserved word in Python.) Returned values range from 0 to positive infinity if lambd is positive, and from negative infinity to 0 if lambd is negative.

random.gammavariate(alpha, beta)

Gamma distribution. (Not the gamma function!) The shape and scale parameters, alpha and beta, must have positive values. (Calling conventions vary and some sources define ‘beta’ as the inverse of the scale).

The probability distribution function is:

pdf(x) = x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)

random.gauss(mu=0.0, sigma=1.0)

Normal distribution, also called the Gaussian distribution. mu is the mean, and sigma is the standard deviation. This is slightly faster than the normalvariate() function defined below.

Multithreading note: When two threads call this function simultaneously, it is possible that they will receive the same return value. This can be avoided in three ways. 1) Have each thread use a different instance of the random number generator. 2) Put locks around all calls. 3) Use the slower, but thread-safe normalvariate() function instead.

Changed in version 3.11: mu and sigma now have default arguments.

random.lognormvariate(mu, sigma)

Log normal distribution. If you take the natural logarithm of this distribution, you’ll get a normal distribution with mean mu and standard deviation sigma. mu can have any value, and sigma must be greater than zero.

random.normalvariate(mu=0.0, sigma=1.0)

Normal distribution. mu is the mean, and sigma is the standard deviation.

Changed in version 3.11: mu and sigma now have default arguments.

random.vonmisesvariate(mu, kappa)

mu is the mean angle, expressed in radians between 0 and 2*pi, and kappa is the concentration parameter, which must be greater than or equal to zero. If kappa is equal to zero, this distribution reduces to a uniform random angle over the range 0 to 2*pi.

random.paretovariate(alpha)

Pareto distribution. alpha is the shape parameter.

random.weibullvariate(alpha, beta)

Weibull distribution. alpha is the scale parameter and beta is the shape parameter.

Alternative Generator

class random.Random([seed])

Class that implements the default pseudo-random number generator used by the random module.

Deprecated since version 3.9: In the future, the seed must be one of the following types: NoneType, int, float, str, bytes, or bytearray.

class random.SystemRandom([seed])

Class that uses the os.urandom() function for generating random numbers from sources provided by the operating system. Not available on all systems. Does not rely on software state, and sequences are not reproducible. Accordingly, the seed() method has no effect and is ignored. The getstate() and setstate() methods raise NotImplementedError if called.

Notes on Reproducibility

Sometimes it is useful to be able to reproduce the sequences given by a pseudo-random number generator. By re-using a seed value, the same sequence should be reproducible from run to run as long as multiple threads are not running.

Most of the random module’s algorithms and seeding functions are subject to change across Python versions, but two aspects are guaranteed not to change:

If a new seeding method is added, then a backward compatible seeder will be offered.

The generator’s random() method will continue to produce the same sequence when the compatible seeder is given the same seed.
