parallelStream() uses the fork/join common pool, which is itself an Executor, so you're right: the two approaches are almost the same.

The common pool is shared by all kinds of things, so some other, unrelated task may have been interfering. By declaring the Executor yourself, you guarantee 4 dedicated threads.
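As a rough illustration of the "dedicated threads" point, here is a minimal sketch that runs some work on its own fixed pool of 4 threads and records which threads did it. The class name, the work() method and the squaring it performs are placeholders invented for this example, not anything from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DedicatedPoolSketch {

    // Hypothetical per-element work; stands in for whatever the real task does.
    static int work(int n) {
        return n * n;
    }

    // Runs work() for 0..count-1 on a dedicated pool of 4 threads and
    // records the names of the threads that executed the tasks.
    static List<Integer> runOnDedicatedPool(int count, Set<String> threadNames) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final int n = i;
                futures.add(executor.submit(() -> {
                    threadNames.add(Thread.currentThread().getName());
                    return work(n);
                }));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // blocks until that task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown(); // no other code can borrow these threads
        }
    }

    public static void main(String[] args) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        System.out.println(runOnDedicatedPool(20, names));
        System.out.println("threads used: " + names);
    }
}
```

Because the pool is created and shut down inside the method, nothing else in the application can compete for its threads, which is the difference from the shared common pool.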

forEach is a fine terminal operation for the first example. The one to avoid would be forEachOrdered, which forces the action to run in encounter order and so gives up most of the parallelism.
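The difference between the two terminal operations can be seen in a small sketch like the following. The class and method names are made up for the illustration; the lists are synchronized only so that forEach can safely add to them from several threads.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

class ForEachOrderSketch {

    // forEach on a parallel stream: elements are handled in whatever
    // order the worker threads get to them.
    static List<Integer> collectWithForEach(int n) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().forEach(i -> out.add(i));
        return out;
    }

    // forEachOrdered: the action is applied one element at a time,
    // in encounter order, even though the stream is parallel.
    static List<Integer> collectWithForEachOrdered(int n) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().forEachOrdered(i -> out.add(i));
        return out;
    }

    public static void main(String[] args) {
        System.out.println("forEach:        " + collectWithForEach(16));
        System.out.println("forEachOrdered: " + collectWithForEachOrdered(16));
    }
}
```

The forEachOrdered list always comes out as 0, 1, 2, ..., while the forEach list may be in any order from run to run.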

Answer from Michael on Stack Overflow
Top answer
1 of 2
5

Variants one and two use ForkJoinPool, which is designed exactly for parallel processing of a single task, while ThreadPoolExecutor is meant for concurrent processing of independent tasks. So one and two are supposed to be faster.
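To make "parallel processing of one task" concrete, here is a minimal fork/join sketch in which a single job (summing an array) is split recursively into subtasks. The class names and the threshold value are arbitrary choices for this example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSumSketch {

    // One logical task (sum a range) that splits itself into halves.
    static final class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000; // arbitrary cut-off for this sketch
        private final long[] data;
        private final int from;
        private final int to;

        SumTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum; // small enough: sum directly
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                           // schedule the left half asynchronously
            return right.compute() + left.join();  // do the right half here, then combine
        }
    }

    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data));
    }
}
```

A ThreadPoolExecutor has no equivalent of fork() and join() for subtasks of the same job; it simply runs whatever independent tasks are submitted to it.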

2 of 2
3

When you use .filter(element -> f(element)).collect(Collectors.toList()), it will collect the matching elements into a List, whereas .collect(Collectors.partitioningBy(element -> f(element))) will collect all elements into either of two lists, followed by you dropping one of them and only retrieving the list of matches via .get(true).

It should be obvious that the second variant can only be on par with the first in the best case, i.e. if all elements match the predicate anyway or when the JVM’s optimizer is capable of removing the redundant work. In the worst case, e.g. when no element matches, the second variant collects a list of all elements only to drop it afterwards, where the first variant would not collect any element.
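The two variants can be put side by side in a small sketch. The predicate f below (even numbers) is a stand-in chosen for this example, since the question's actual predicate is not shown.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class FilterVsPartitionSketch {

    // Hypothetical predicate standing in for the question's f(element).
    static boolean f(int element) {
        return element % 2 == 0;
    }

    // Variant one: only matching elements are ever collected.
    static List<Integer> viaFilter(List<Integer> input) {
        return input.parallelStream()
                .filter(element -> f(element))
                .collect(Collectors.toList());
    }

    // Variant two: every element lands in one of two lists,
    // and the non-matching list is thrown away afterwards.
    static List<Integer> viaPartitioning(List<Integer> input) {
        Map<Boolean, List<Integer>> parts = input.parallelStream()
                .collect(Collectors.partitioningBy(element -> f(element)));
        return parts.get(true);
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.range(0, 20).boxed().collect(Collectors.toList());
        System.out.println(viaFilter(input));
        System.out.println(viaPartitioning(input));
    }
}
```

Both methods return the same list; the second one just does extra collecting work for the elements that fail the predicate.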

The third variant is not comparable, as you didn’t show an actual implementation but just a sketch. There is no point in comparing a hypothetical implementation with an actual one. The logic you describe is the same as the logic of the parallel stream implementation, so you’re just reinventing the wheel. It may happen that you do something slightly better than the reference implementation or just better tailored to your specific task, but chances are much higher that you overlook things which the Stream API implementors already considered during a development process that lasted several years.

So I wouldn’t place any bets on your third variant. If we add the time you will need to complete the third variant’s implementation, it can never be more efficient than just using either of the other variants.

So the first variant is the most efficient variant, especially as it is also the simplest, most readable, directly expressing your intent.

Fork/Join Framework vs. Parallel Streams vs. ExecutorService: The Ultimate Fork/Join Benchmark (comment on r/java, January 20, 2015)

The problem is, the file-processing benchmark is, unfortunately, really, really bad. For one, the file-processing method doesn't actually do anything. In fact -- if it weren't for some benign side effects that the JVM probably can't eliminate -- the entire processing code is completely dead, and could theoretically be skipped altogether by the JVM. And those side effects have nothing to do with actually processing the file. The file is re-opened, again and again, in each task, and then a simple IOStream is used to skip to the desired line. Not only is it dead code, it's dead code that does nothing in a really bad way (maybe the two cancel each other out).

The way this should have been done is to map the file to a byte-buffer once, and then to actually process it and produce a result. But even then, you still couldn't do the processing with fork join/parallel streams. The bigger problem is that in the parallel streams benchmarks and the FJ benchmarks they are doing blocking IO in the tasks. You're not supposed to do that.

What those two benchmarks are actually measuring is how well the FJ scheduler is able to overcome the absolute worst kind of abuse (and it does so quite well -- in fact, it's possible that the FJ pool actually spawns many more threads than they think are actually running), which might, or might not, be offset by the fact that the processing code is dead code.
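For readers unfamiliar with the "map the file to a byte-buffer once" suggestion, here is a minimal sketch of what that could look like. It is not the benchmark's code; the class name, the newline-counting "processing" and the temporary file in demo() are all assumptions made for the illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedFileSketch {

    // Opens and maps the file once, then scans the mapped bytes and
    // returns a real result (the number of newline bytes).
    // Note: a single mapping is limited to Integer.MAX_VALUE bytes.
    static long countNewlines(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long count = 0;
            while (buffer.hasRemaining()) {
                if (buffer.get() == '\n') {
                    count++;
                }
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a small temporary file (hypothetical test data) and counts its lines.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("mapped-sketch", ".txt");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, "a\nb\nc\n".getBytes(StandardCharsets.UTF_8));
            return countNewlines(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("newlines: " + demo());
    }
}
```

Because the result is returned and used, the JVM cannot treat the scan as dead code, which addresses the first objection in the comment.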
Top answer
1 of 3
5

You can define a custom thread pool by implementing the Executor interface so that it increases or decreases the number of threads in the pool as needed. You can then submit your parallelStream chain to a pool, as shown here using a ForkJoinPool:

I've created a working example which prints the threads that are doing the work:

import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

public class TestParallel
{
  public static void main(String... args) throws InterruptedException, ExecutionException
  {
    testParallel();
  }

  // Reduction step; prints the thread that performs each addition.
  static Long sum(long a, long b)
  {
    System.out.println(Thread.currentThread() + " - sum: " + a + " " + b);
    return a + b;
  }

  public static void testParallel()
      throws InterruptedException, ExecutionException
  {
    long firstNum = 1;
    long lastNum = 10;

    List<Long> aList = LongStream.rangeClosed(firstNum, lastNum).boxed()
        .collect(Collectors.toList());

    System.out.println("custom: ");
    System.out.println();

    // Run the parallel stream as a task inside a dedicated pool of 4 threads.
    ForkJoinPool customThreadPool = new ForkJoinPool(4);
    long totalCustom;
    try {
      totalCustom = customThreadPool.submit(
          () -> aList.parallelStream().reduce(0L, TestParallel::sum)).get();
    } finally {
      customThreadPool.shutdown(); // release the pool's threads when done
    }

    System.out.println();
    System.out.println("standard: ");
    System.out.println();

    // Same reduction, this time on the common pool (plus the calling thread).
    long totalStandard = aList.parallelStream().reduce(0L, TestParallel::sum);

    System.out.println();
    System.out.println(totalCustom + " " + totalStandard);
  }
}

Personally, if you want to get to that level of control, I'm not sure the streaming API is worth bothering with. It's not doing anything you can't do with Executors and concurrent libs. It's just a simplified facade to those features with limited capabilities.

Streams are kind of nice when you need to lay out a simple multi-step process in a little bit of code. But if all you are doing is using them to manage parallelism of tasks, the Executors and ExecutorService are more straightforward IMO. One thing I would avoid is pushing the number of threads above your machine's native thread count unless you have IO-bound processing. And if that's the case NIO is the more efficient solution.

What I'm not sure about is what the logic is that decides when to use multiple threads and when to use one. You'd have to better explain what factors come into play.

2 of 3
2

I don't know if this is useful, but there is a design pattern called Bridge that decouples an abstraction from its implementation so that you can switch between implementations at runtime.

A simple example would be a stack. For stacks where the total amount of data stored at one time is relatively small, it is more efficient to use an array. When the amount of data hits a certain point, it becomes better to use a linked-list. The stack implementation determines when it switches from one to the other.

For your case, it sounds like the processing would be behind some interface and based on the volume (do you know it before you start the processing?) your Processor class could use streams or parallel streams as appropriate.
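A very small sketch of that idea, assuming the volume is known up front: a single method decides between a sequential and a parallel stream, and callers never see which one they got. The class name, the threshold value and the doubling step are hypothetical and would need tuning or replacing for real work.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class AdaptiveProcessorSketch {

    // Hypothetical cut-off; the right value depends on the workload.
    static final int PARALLEL_THRESHOLD = 10_000;

    // Picks the stream flavour from the input size.
    static <T> Stream<T> streamFor(List<T> items) {
        return items.size() >= PARALLEL_THRESHOLD
                ? items.parallelStream()
                : items.stream();
    }

    // The processing pipeline itself is written once for both cases.
    static List<Integer> process(List<Integer> items) {
        return streamFor(items)
                .map(x -> x * 2) // placeholder for the real processing step
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> small = IntStream.range(0, 10).boxed().collect(Collectors.toList());
        List<Integer> large = IntStream.range(0, 50_000).boxed().collect(Collectors.toList());
        System.out.println("small parallel? " + streamFor(small).isParallel());
        System.out.println("large parallel? " + streamFor(large).isParallel());
        System.out.println(process(small));
    }
}
```

If the volume is not known before processing starts, this simple size check does not apply and the decision would have to be made some other way.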

Top answer
1 of 16
520

There actually is a trick for executing a parallel operation in a specific fork-join pool. If you execute it as a task in a fork-join pool, it stays there and does not use the common one.

final int parallelism = 4;
ForkJoinPool forkJoinPool = null;
try {
    forkJoinPool = new ForkJoinPool(parallelism);
    final List<Integer> primes = forkJoinPool.submit(() ->
        // Parallel task here, for example
        IntStream.range(1, 1_000_000).parallel()
                .filter(PrimesPrint::isPrime)
                .boxed().collect(Collectors.toList())
    ).get();
    System.out.println(primes);
} catch (InterruptedException | ExecutionException e) {
    throw new RuntimeException(e);
} finally {
    if (forkJoinPool != null) {
        forkJoinPool.shutdown();
    }
}

The trick is based on ForkJoinTask.fork, whose documentation specifies: "Arranges to asynchronously execute this task in the pool the current task is running in, if applicable, or using the ForkJoinPool.commonPool() if not inForkJoinPool()"
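One way to check the trick on your own JVM is to ask, for every element, which pool the current thread belongs to. The sketch below does that with ForkJoinTask.getPool(); the class and method names are invented for the example, and the behaviour it observes is how current implementations work rather than something the Stream API documents as a guarantee.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.stream.IntStream;

class PoolCheckSketch {

    // Runs a parallel stream of n elements as a task inside a custom pool
    // and counts how many elements were processed by that pool's threads.
    static long elementsRunInCustomPool(int parallelism, int n) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.submit(() ->
                    IntStream.range(0, n).parallel()
                            // getPool() is the pool hosting the current thread, or null
                            .filter(i -> ForkJoinTask.getPool() == pool)
                            .count()
            ).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(elementsRunInCustomPool(4, 1_000) + " of 1000 elements ran in the custom pool");
    }
}
```

If the count equals the number of elements, no part of the stream leaked into the common pool.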

2 of 16
251

Parallel streams use the default ForkJoinPool.commonPool(), which by default has one fewer thread than you have processors, as returned by Runtime.getRuntime().availableProcessors(). (This means that parallel streams leave one processor for the calling thread.)

For applications that require separate or custom pools, a ForkJoinPool may be constructed with a given target parallelism level; by default, equal to the number of available processors.

This also means if you have nested parallel streams or multiple parallel streams started concurrently, they will all share the same pool. Advantage: you will never use more than the default (number of available processors). Disadvantage: you may not get "all the processors" assigned to each parallel stream you initiate (if you happen to have more than one). (Apparently you can use a ManagedBlocker to circumvent that.)

To change the way parallel streams are executed, you can either

  • submit the parallel stream execution to your own ForkJoinPool: yourFJP.submit(() -> stream.parallel().forEach(doSomething)).get(); or
  • change the size of the common pool using system properties: System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "20") for a target parallelism of 20 threads. The property has to be set before the common pool is first used.
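Before changing anything, it can help to see what the defaults are on a given machine. The following sketch simply prints the processor count and the common pool's target parallelism; the class name is made up for the example.

```java
import java.util.concurrent.ForkJoinPool;

class CommonPoolInfoSketch {

    // Number of processors the JVM reports.
    static int processors() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the shared common pool used by parallel streams.
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("available processors:    " + processors());
        System.out.println("common pool parallelism: " + commonPoolParallelism());
    }
}
```

Running it with -Djava.util.concurrent.ForkJoinPool.common.parallelism=20 on the command line should change the second number, which is an alternative to calling System.setProperty early in main.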

Example of the latter on my machine which has 8 processors. If I run the following program:

long start = System.currentTimeMillis();
IntStream s = IntStream.range(0, 20);
//System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "20");
s.parallel().forEach(i -> {
    try { Thread.sleep(100); } catch (Exception ignore) {}
    System.out.print((System.currentTimeMillis() - start) + " ");
});

The output is:

215 216 216 216 216 216 216 216 315 316 316 316 316 316 316 316 415 416 416 416

So you can see that the parallel stream processes 8 items at a time, i.e. it uses 8 threads. However, if I uncomment the commented line, the output is:

215 215 215 215 215 216 216 216 216 216 216 216 216 216 216 216 216 216 216 216

This time, the parallel stream has used 20 threads and all 20 elements in the stream have been processed concurrently.
