This generator function yields the mini-batches given the inputs and targets:

import numpy as np

def iterate_minibatches(inputs, targets, batchsize, shuffle=False):
    # inputs and targets must contain the same number of samples
    assert inputs.shape[0] == targets.shape[0]
    if shuffle:
        # Shuffle sample indices so each epoch visits the data in a new order
        indices = np.arange(inputs.shape[0])
        np.random.shuffle(indices)
    # Note: leftover samples that do not fill a complete batch are dropped
    for start_idx in range(0, inputs.shape[0] - batchsize + 1, batchsize):
        if shuffle:
            excerpt = indices[start_idx:start_idx + batchsize]
        else:
            excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt], targets[excerpt]

and this shows how to use it for training:

import logging

for n in range(n_epochs):  # use xrange on Python 2
    for x_batch, y_batch in iterate_minibatches(X, Y, batch_size, shuffle=True):
        l_train, acc_train = f_train(x_batch, y_batch)

    l_val, acc_val = f_val(Xt, Yt)
    logging.info('epoch %d, train_loss %s, acc %s, val_loss %s, acc %s',
                 n, l_train, acc_train, l_val, acc_val)

You still need to define f_train, f_val and the other functions yourself, depending on the optimisation library (e.g. Lasagne, Keras) you are using.
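As a hypothetical, framework-free sketch of what f_train and f_val might look like (the softmax-regression model, learning rate, and parameter names here are my own assumptions for illustration, not part of the answer above), plain NumPy versions could be:

```python
import numpy as np

# Hypothetical model state: a tiny softmax regression (2 features -> 2 classes)
rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(2, 2))
b = np.zeros(2)
lr = 0.1

def _forward(x, y):
    """Return (per-class probabilities, mean cross-entropy loss, accuracy)."""
    logits = x @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    n = x.shape[0]
    loss = -np.log(probs[np.arange(n), y]).mean()
    acc = (probs.argmax(axis=1) == y).mean()
    return probs, loss, acc

def f_train(x_batch, y_batch):
    """One SGD step on the batch; returns (loss, accuracy) before the update."""
    global W, b
    probs, loss, acc = _forward(x_batch, y_batch)
    n = x_batch.shape[0]
    # Gradient of mean cross-entropy w.r.t. logits: probs - one_hot(y)
    grad = probs.copy()
    grad[np.arange(n), y_batch] -= 1
    grad /= n
    W -= lr * x_batch.T @ grad
    b -= lr * grad.sum(axis=0)
    return loss, acc

def f_val(x, y):
    """Loss and accuracy without updating parameters."""
    _, loss, acc = _forward(x, y)
    return loss, acc
```

With a real framework, f_train and f_val would instead be compiled/autograd training and evaluation functions, but the (loss, accuracy) return convention is the same one the loop above expects.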

Answer from Ash on Stack Overflow
Another answer from the same Stack Overflow question:

The following function yields mini-batches. It is based on the function provided by Ash, but correctly handles the last, smaller mini-batch instead of silently dropping it.

import numpy as np

def iterate_minibatches(inputs, targets, batchsize, shuffle=False):
    assert inputs.shape[0] == targets.shape[0]
    if shuffle:
        indices = np.arange(inputs.shape[0])
        np.random.shuffle(indices)
    for start_idx in range(0, inputs.shape[0], batchsize):
        # Clamp the end index so the final batch may be smaller than batchsize
        end_idx = min(start_idx + batchsize, inputs.shape[0])
        if shuffle:
            excerpt = indices[start_idx:end_idx]
        else:
            excerpt = slice(start_idx, end_idx)
        yield inputs[excerpt], targets[excerpt]
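As a quick self-contained check of the difference (the function is repeated here so the snippet runs on its own; the toy array sizes are my own choice), iterating over 10 samples with a batch size of 3 yields a final batch of size 1, which the earlier version would have skipped:

```python
import numpy as np

def iterate_minibatches(inputs, targets, batchsize, shuffle=False):
    assert inputs.shape[0] == targets.shape[0]
    if shuffle:
        indices = np.arange(inputs.shape[0])
        np.random.shuffle(indices)
    for start_idx in range(0, inputs.shape[0], batchsize):
        end_idx = min(start_idx + batchsize, inputs.shape[0])
        if shuffle:
            excerpt = indices[start_idx:end_idx]
        else:
            excerpt = slice(start_idx, end_idx)
        yield inputs[excerpt], targets[excerpt]

X = np.arange(20).reshape(10, 2)   # 10 samples, 2 features
Y = np.arange(10)

sizes = [x.shape[0] for x, _ in iterate_minibatches(X, Y, batchsize=3)]
print(sizes)  # [3, 3, 3, 1] -- every sample is visited exactly once
```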