There cannot be a simpler solution:
- To know the duplicated ids, you must iterate over the entire collection.
- To print all the persons with duplicated ids, you must keep their full list.
As such, you will need to load the entire collection of persons in memory; there's no way around that. If you needed only the duplicated ids but not the Person objects, you could keep just the ids with their counts and discard the Person objects as you go, which would be more efficient. (But that's not the case here.)
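For the ids-only case mentioned above, a minimal sketch could look like the following. It assumes the ids are available as plain strings (for instance via people.stream().map(Person::getId)); the class and method names are hypothetical and not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class DuplicateIdCounter {
    // Returns the ids that occur more than once.
    // Only ids and their counts are retained, never the Person objects.
    static Set<String> duplicateIds(List<String> ids) {
        Map<String, Long> counts = ids.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
                .filter(entry -> entry.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Ids "1" and "2" are repeated, "3" is not.
        System.out.println(duplicateIds(Arrays.asList("1", "2", "1", "3", "2", "1")));
    }
}
```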
In any case, your solution can be more concise if you skip the intermediary map variable with the mapping of ids to lists of users:
people.stream()
.collect(Collectors.groupingBy(Person::getId)).values().stream()
.filter(peopleWithSameId -> peopleWithSameId.size() > 1)
.forEach(peopleWithSameId -> System.out.println("People with identical IDs: " + peopleWithSameId));
Btw, in case you're wondering if the .stream() there could be .parallelStream(), it would be pointless, due to the synchronization in the println method of System.out (a PrintStream). (And without synchronization println wouldn't be thread safe anyway.)
Your code and Java 8 usage look fine to me in general.
I do see an issue with the Person class: it looks like you intend it to be immutable, and if so, you should enforce that.
You need to ensure that the name and id fields can never be changed, which you can do by declaring them final. Your code currently seems to be safe, but it is not: I can extend Person and offer a method there to change the name and id fields, which violates the assumed invariant that those fields in Person are immutable.
Simply changing it to the following will do:
public class Person {
private final String name;
private final String id;
...
}
Onto the Java 8 usage now.
It is a good thing that you use Collectors.groupingBy to produce a Map<String, List<Person>>. You cannot do it much faster if you want it to work with any kind of List<Person> as input, and this way you avoid nasty bugs and reimplementing what many people have already done, namely the grouping operation.
Printing the offending values using a Stream seems fine as well, though you could rewrite it to look a little cleaner. Something like this could work:
peopleById.values().stream()
.filter(personList -> personList.size() > 1)
.forEach(personList -> System.out.println("People with identical IDs: " + personList));
This is my personal preference on how to format it, though. The only real change is renaming peopleWithSameId to personList, as it is simply a List<Person> and nothing more or less.
You've done a good job overall.
You could use IntStream:
IntStream.range(0, listA.size())
.mapToObj(index ->
new RecordB(listA.get(index).getId(), listA.get(index).getValue(), listA.get(index).getValue() - (index > 0 ? listA.get(index - 1).getValue() : 0))
)
.collect(Collectors.toList());
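As a self-contained sketch of this index-based approach: RecordA and RecordB below are hypothetical stand-ins, since the question's classes are not shown here. Note that IntStream.map can only produce ints, so building objects requires mapToObj. With this formula the first element's difference equals its own value, because the "previous" value is taken to be 0.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class DeltaDemo {
    // Hypothetical stand-in for the question's RecordA.
    static class RecordA {
        final int id;
        final int value;
        RecordA(int id, int value) { this.id = id; this.value = value; }
        int getId() { return id; }
        int getValue() { return value; }
    }

    // Hypothetical stand-in for RecordB: id, value, and difference to the previous value.
    static class RecordB {
        final int id;
        final int value;
        final int delta;
        RecordB(int id, int value, int delta) { this.id = id; this.value = value; this.delta = delta; }
        @Override public String toString() { return id + ", " + value + ", " + delta; }
    }

    static List<RecordB> withDeltas(List<RecordA> listA) {
        return IntStream.range(0, listA.size())
                .mapToObj(i -> {
                    RecordA current = listA.get(i);
                    // The first element has no predecessor, so 0 is used as the previous value.
                    int previous = i > 0 ? listA.get(i - 1).getValue() : 0;
                    return new RecordB(current.getId(), current.getValue(), current.getValue() - previous);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<RecordA> listA = Arrays.asList(
                new RecordA(1, 10), new RecordA(2, 15), new RecordA(3, 25), new RecordA(4, 30));
        withDeltas(listA).forEach(System.out::println);
    }
}
```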
"... I would like to avoid the class for loop and the previous_val variable. Any ideas how to do this with streams?"
This is a somewhat unintuitive approach; I actually had to look it up.
StackOverflow – Java Stream Using Previous Element in Foreach Lambda.
Typically the use of a stream is to aggregate a set of values, and not necessarily compare and contrast them.
Lesson: Aggregate Operations (The Java Tutorials > Collections).
Here is an example utilizing the Collector class, and the Collector#of method.
Essentially, during the collect, you can retrieve the previous element, from whatever has already been collected.
For the BiConsumer argument, a is your collected elements, thus far.
List<RecordB> l
= listA.stream()
.collect(
Collector.<RecordA, List<RecordB>, List<RecordB>>of(
ArrayList::new,
(a, b) -> {
if (a.isEmpty()) a.add(new RecordB(b.id, b.value, 0));
else {
RecordB x = a.get(a.size() - 1);
a.add(new RecordB(b.id, b.value, b.value - x.value));
}
},
(a, b) -> {
a.addAll(b);
return a;
},
x -> x));
Output
1, 10, 0
2, 15, 5
3, 25, 10
4, 30, 5
On a final note, you may want to get rid of the RecordB class, and just utilize a Map.
Map<RecordA, Integer> m = new LinkedHashMap<>();
RecordA a, b;
m.put(a = listA.get(0), 0);
for (int i = 1, n = listA.size(); i < n; i++)
m.put(b = listA.get(i), -a.value + (a = b).value);
Let's run through each part of the code. First, createSharedListViaStream:
public static List<SchoolObj> createSharedListViaStream(List<SchoolObj> listOne, List<SchoolObj> listTwo)
{
// We create a stream of elements from the first list.
List<SchoolObj> listOneList = listOne.stream()
// We select any elements such that in the stream of elements from the second list
.filter(two -> listTwo.stream()
// there is an element that has the same name and school as this element,
.anyMatch(one -> one.getName().equals(two.getName())
&& two.getSchool().equals(one.getSchool())))
// and collect all matching elements from the first list into a new list.
.collect(Collectors.toList());
// We return the collected list.
return listOneList;
}
After running through the code, it does exactly what you want it to do. Now, let's run through createSharedListViaLoop:
public static List<SchoolObj> createSharedListViaLoop(List<SchoolObj> listOne, List<SchoolObj> listTwo)
{
// We build up a result by...
List<SchoolObj> result = new ArrayList<SchoolObj>();
// going through each element in the first list,
for (SchoolObj one : listOne)
{
// going through each element in the second list,
for (SchoolObj two : listTwo)
{
// and collecting the first list's element if it matches the second list's element.
if (one.getName().equals(two.getName()) && one.getSchool().equals(two.getSchool()))
{
result.add(one);
}
}
}
// We return the collected list
return result;
}
So far, so good... right? In fact, your code in createSharedListViaStream is fundamentally correct; instead, it is your createSharedListViaLoop that may be causing discrepancies in output.
Think about the following set of inputs:
List1 = [SchoolObj("nameA","SchoolX"), SchoolObj("nameC","SchoolZ")]
List2 = [SchoolObj("nameA","SchoolX"), SchoolObj("nameA","SchoolX"), SchoolObj("nameB","SchoolY")]
Here, createSharedListViaStream will return the only element of the first list that appears in both lists: SchoolObj("nameA","SchoolX"). However, createSharedListViaLoop will return the following list: [SchoolObj("nameA","SchoolX"),SchoolObj("nameA","SchoolX")]. More precisely, createSharedListViaLoop will collect the correct object, but it will do so twice. I suspect this to be the reason for the output of createSharedListViaStream to be "incorrect" based on comparison to the output of createSharedListViaLoop.
The reason that createSharedListViaLoop does this duplication is based on the lack of termination of its inner for loop. Although we iterate over all elements of the first list to check if they are present in the second, finding a single match will suffice to add the element to the result. We can avoid redundant element addition by changing the inner loop to the following:
for (SchoolObj one : listOne)
{
for (SchoolObj two : listTwo)
{
if (one.getName().equals(two.getName()) && one.getSchool().equals(two.getSchool()))
{
result.add(one);
break;
}
}
}
Additionally, if you don't want duplicate Objects in your list (by location in memory), you can use distinct like so:
List<SchoolObj> result = ...;
result = result.stream().distinct().collect(Collectors.toList());
As a final caution, the above will keep the results distinct in the following scenario:
List<SchoolObj> list = new ArrayList<>();
SchoolObj duplicate = new SchoolObj("nameC", "schoolD");
list.add(duplicate);
list.add(duplicate);
list.stream().distinct().forEach(System.out::println);
// prints:
// nameC schoolD
However, it will not work in the following scenario, unless you override the equals method for SchoolObj:
List<SchoolObj> list = new ArrayList<>();
list.add(new SchoolObj("nameC", "schoolD"));
list.add(new SchoolObj("nameC", "schoolD"));
list.stream().distinct().forEach(System.out::println);
// prints (unless Object::equals overridden)
// nameC schoolD
// nameC schoolD
You can filter one list by whether each element is contained in the other list, then collect the matches.
List<SchoolObj> listCommon = listTwo.stream()
.filter(e -> listOne.contains(e))
.collect(Collectors.toList());
You need to override the equals() method in the SchoolObj class, because contains() uses equals() to decide whether two objects are the same.
@Override
public boolean equals(Object o) {
if (!(o instanceof SchoolObj))
return false;
SchoolObj n = (SchoolObj) o;
return n.name.equals(name) && n.school.equals(school);
}
A better solution is to put one list into a Set and filter the other list by membership in that Set. HashSet#contains runs in expected O(1) time, which is faster than the O(n) scan of List#contains.
Set<SchoolObj> setOne = new HashSet<>(listOne);
List<SchoolObj> listCommon = listTwo.stream()
.filter(e -> setOne.contains(e))
.collect(Collectors.toList());
For Set#contains to work, you also need to override hashCode() along with equals() in the SchoolObj class (assuming name and school can't be null).
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + name.hashCode();
result = prime * result + school.hashCode();
return result;
}
Here you will get details how to override equals and hashCode in a better way
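One common way to write both methods more compactly is with java.util.Objects, which also tolerates null fields. The sketch below is an illustration only: the SchoolObj shown is a hypothetical minimal version with just name and school, and the class and method names around it are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class SchoolObjDemo {
    // Hypothetical minimal SchoolObj with value-based equality.
    static final class SchoolObj {
        private final String name;
        private final String school;

        SchoolObj(String name, String school) {
            this.name = name;
            this.school = school;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SchoolObj)) return false;
            SchoolObj other = (SchoolObj) o;
            // Objects.equals handles null fields without throwing.
            return Objects.equals(name, other.name) && Objects.equals(school, other.school);
        }

        @Override
        public int hashCode() {
            // Must be consistent with equals: equal objects yield equal hash codes.
            return Objects.hash(name, school);
        }

        @Override
        public String toString() {
            return name + " " + school;
        }
    }

    // Elements of listTwo that also occur in listOne, looked up through a HashSet.
    static List<SchoolObj> common(List<SchoolObj> listOne, List<SchoolObj> listTwo) {
        Set<SchoolObj> setOne = new HashSet<>(listOne);
        return listTwo.stream().filter(setOne::contains).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<SchoolObj> one = Arrays.asList(new SchoolObj("nameA", "SchoolX"), new SchoolObj("nameC", "SchoolZ"));
        List<SchoolObj> two = Arrays.asList(new SchoolObj("nameA", "SchoolX"), new SchoolObj("nameB", "SchoolY"));
        System.out.println(common(one, two)); // prints [nameA SchoolX]
    }
}
```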
My free StreamEx library allows you to process pairs of adjacent stream elements using the additional pairMap intermediate operation, like this:
StreamEx.of(input).pairMap((current, next) -> doSomethingWith(current, next));
Where input is a Collection, array or Stream. For example, this way you can easily check whether input is sorted:
boolean isSorted = StreamEx.of(input)
.pairMap((current, next) -> next.compareTo(current))
.allMatch(cmp -> cmp >= 0);
There's also a forPairs terminal operation, which is a forEach analog for all pairs of adjacent input elements:
StreamEx.of(input).forPairs((current, next) -> doSomethingWith(current, next));
These features work nicely with any stream source (either random access or not) and fully support the parallel streams.
One way is to generate an IntStream of the indices and fetch the List elements by index. This is only efficient if the List supports random access (for a LinkedList it would be a bad idea, since list.get(i) doesn't take constant time).
For example:
IntStream.range(0,list.size()-1).forEach(i -> {
doSomething(list.get(i),list.get(i+1));
});
Another way is to store the last element in an array:
List<Element> list = ...
Element[] arr = new Element[1];
list.stream().forEach(e -> {
if (arr[0] != null)
doSomething(arr[0],e);
arr[0]=e;
});
This will only work for sequential streams.
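As a concrete use of the index-based variant, here is a small sketch that checks whether a list is sorted by comparing each element with its successor. The class and method names are hypothetical, and a random-access list such as ArrayList is assumed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

class PairwiseDemo {
    // True if every element is <= its successor.
    // For lists with fewer than two elements the index range is empty, so the result is true.
    static <T extends Comparable<? super T>> boolean isSorted(List<T> list) {
        return IntStream.range(0, list.size() - 1)
                .allMatch(i -> list.get(i).compareTo(list.get(i + 1)) <= 0);
    }

    public static void main(String[] args) {
        System.out.println(isSorted(Arrays.asList(1, 2, 2, 5))); // prints true
        System.out.println(isSorted(Arrays.asList(3, 1, 2)));    // prints false
    }
}
```

Because this version keeps no mutable state outside the stream, it does not share the sequential-only restriction of the array trick.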
It is very important to understand that the streaming API provided by Java is not just an API, but a different programming paradigm as well. When you use streams you need to embrace the paradigm: you tell streams what you want to achieve, but not how you want to achieve it. And, as with many other things, using streams is not always the best solution. Sometimes plain old for loops make the code easier to read and probably perform better, too.
Using Streams
If you want to use streams in this example, you could do something like the following.
As far as I can tell, you are trying to find elements that (based on some criteria) are the same. In other words, you want to group elements based on the value of a property of the Model object.
Let's say our Model class looks like this:
public class Model {
private int id;
private String name;
public Model(int id, String name) {
this.id = id;
this.name = name;
}
public int getId() {
return id;
}
public String getName() {
return name;
}
}
The example below shows how to group elements of a list of models based on their names.
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
public class CompareItems {
public static void main(String[] args) {
List<Model> items = Arrays.asList(
new Model(1, "model_1"), new Model(2, "model_2"), new Model(3, "model_1"),
new Model(4, "model_3"), new Model(5, "model_2"), new Model(6, "model_2"));
Map<String, Set<Integer>> result =
items.stream()
.collect(Collectors.groupingBy(Model::getName, Collectors.mapping(Model::getId, Collectors.toSet())));
System.out.println(result);
}
}
When you print the result object, you will get this:
{model_1=[1, 3], model_3=[4], model_2=[2, 5, 6]}
The result object maps each model name to the IDs of all Model objects that have that name.
Depending on what you want to do, you can then apply the changes to each group by iterating over the result object.
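As an illustration of that last step, the sketch below walks over such a grouping and picks out the names shared by more than one ID. The map is built by hand with the same contents as the printed result; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GroupIterationDemo {
    // Returns the names that map to more than one id, sorted alphabetically.
    static List<String> namesWithSeveralIds(Map<String, Set<Integer>> idsByName) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Set<Integer>> group : idsByName.entrySet()) {
            if (group.getValue().size() > 1) {
                names.add(group.getKey());
            }
        }
        // HashMap iteration order is unspecified, so sort for a stable result.
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> result = new HashMap<>();
        result.put("model_1", new HashSet<>(Arrays.asList(1, 3)));
        result.put("model_2", new HashSet<>(Arrays.asList(2, 5, 6)));
        result.put("model_3", new HashSet<>(Arrays.asList(4)));
        System.out.println(namesWithSeveralIds(result)); // prints [model_1, model_2]
    }
}
```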
If you have two ArrayLists A and B, you can get
- A + B with A.addAll(B)
- A - B with A.removeAll(B)
- A ∩ B with A.retainAll(B)
- A ∪ B by copying A to a Set A' and then calling A'.addAll(B)
Then you can use streams to process the result if you need to. However, getting the subset you are interested in (for example, the items in common, A ∩ B) is really not something I would consider doing with streams.
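Keep in mind that addAll, removeAll and retainAll modify the list they are called on. A minimal sketch that leaves the inputs untouched by working on copies might look like this (the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ListSetOps {
    // A - B: elements of a that do not occur in b.
    static List<Integer> difference(List<Integer> a, List<Integer> b) {
        List<Integer> result = new ArrayList<>(a); // copy, so a is not modified
        result.removeAll(b);
        return result;
    }

    // A ∩ B: elements of a that also occur in b.
    static List<Integer> intersection(List<Integer> a, List<Integer> b) {
        List<Integer> result = new ArrayList<>(a);
        result.retainAll(b);
        return result;
    }

    // A ∪ B: a LinkedHashSet drops duplicates and keeps first-seen order.
    static Set<Integer> union(List<Integer> a, List<Integer> b) {
        Set<Integer> result = new LinkedHashSet<>(a);
        result.addAll(b);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3, 4);
        List<Integer> b = Arrays.asList(3, 4, 5);
        System.out.println(difference(a, b));   // prints [1, 2]
        System.out.println(intersection(a, b)); // prints [3, 4]
        System.out.println(union(a, b));        // prints [1, 2, 3, 4, 5]
    }
}
```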
You don't need to compare the elements yourself:
List<Integer> c = new ArrayList<>(a);
c.removeAll(b);
And if you don't mind losing the original list data:
a.removeAll(b);
Something like this should suffice:
Set<Integer> container = new HashSet<>(ListB);
ListA.stream()
.filter(id -> !container.contains(id))
.forEach(System.out::println);
or non-stream:
Set<Integer> container = new HashSet<>(ListB);
for(Integer id : ListA)
if(!container.contains(id))
System.out.println(id);
You can make the code shorter with filter and count, but using loops is cleaner here. Streams are not always the solution.
return Arrays.asList(
IntStream.range(0, Math.min(a.size(), b.size())).filter(i -> a.get(i) > b.get(i)).count(),
IntStream.range(0, Math.min(a.size(), b.size())).filter(i -> a.get(i) < b.get(i)).count()
);
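For comparison, a plain loop version might look like the sketch below. It assumes the method receives two lists of scores and returns the two win counts as integers; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class CompareLoop {
    // Counts the positions where a beats b and where b beats a; ties score nothing.
    static List<Integer> compare(List<Integer> a, List<Integer> b) {
        int aWins = 0;
        int bWins = 0;
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            if (a.get(i) > b.get(i)) {
                aWins++;
            } else if (a.get(i) < b.get(i)) {
                bWins++;
            }
        }
        return Arrays.asList(aWins, bWins);
    }

    public static void main(String[] args) {
        System.out.println(compare(Arrays.asList(5, 6, 7), Arrays.asList(3, 6, 10))); // prints [1, 1]
    }
}
```

The loop walks both lists once, whereas the two-stream version traverses the index range twice.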
Java 8 - partitioningBy() & counting()
One way to solve this problem with streams while performing only one iteration over the given data is to use the built-in collectors partitioningBy() and counting():
public static List<Integer> compareTriplets(List<Integer> a, List<Integer> b) {
return IntStream.range(0, a.size())
.map(i -> a.get(i) - b.get(i))
.filter(i -> i != 0)
.boxed()
.collect(Collectors.collectingAndThen(
Collectors.partitioningBy(i -> i > 0, Collectors.counting()),
map -> Arrays.asList(map.get(true).intValue(), map.get(false).intValue())
));
}
Java 8 - custom Collector
Another option would be to define a custom Collector using the static factory method Collector.of(). Like the previous approach, it processes the data using a single stream:
public static List<Integer> compareTriplets(List<Integer> a, List<Integer> b) {
return IntStream.range(0, a.size())
.boxed()
.collect(Collector.of(
() -> new int[]{0, 0},
(int[] score, Integer i) -> {
if (a.get(i) > b.get(i)) score[0]++;
if (b.get(i) > a.get(i)) score[1]++;
},
(int[] left, int[] right) -> {
Arrays.setAll(left, i -> left[i] + right[i]);
return left;
},
arr -> Arrays.asList(arr[0], arr[1])
));
}
Java 12 - Collector teeing()
Another option that produces the result using a single stream is the Java 12 collector teeing(), which expects two downstream Collectors and a Function that performs a final transformation by merging the results they produced.
public static List<Integer> compareTriplets(List<Integer> a, List<Integer> b) {
return IntStream.range(0, a.size())
.boxed()
.collect(Collectors.teeing(
Collectors.filtering(i -> a.get(i) > b.get(i), Collectors.counting()),
Collectors.filtering(i -> b.get(i) > a.get(i), Collectors.counting()),
(alice, bob) -> List.of(alice.intValue(), bob.intValue())
));
}
Your question’s code does not reflect what you describe in the comments. In the comments you say that all names should be present and the size should match, in other words, only the order may be different.
Your code is
List<Person> people = getPeopleFromDatabasePseudoMethod();
List<String> expectedValues = Arrays.asList("john", "joe", "bill");
assertTrue(people.stream().map(person -> person.getName())
.collect(Collectors.toList()).containsAll(expectedValues));
which lacks a test for the size of people; in other words, it allows duplicates. Further, using containsAll to combine two Lists is very inefficient. It’s much better to use a collection type which reflects your intention, i.e. one that has no duplicates, does not care about order, and has an efficient lookup:
Set<String> expectedNames=new HashSet<>(expectedValues);
assertTrue(people.stream().map(Person::getName)
.collect(Collectors.toSet()).equals(expectedNames));
With this solution you don’t need to test for the size manually; if the sets match, they necessarily have the same size, and only the order may differ.
There is a solution which does not require collecting the names of persons:
Set<String> expectedNames=new HashSet<>(expectedValues);
assertTrue(people.stream().allMatch(p->expectedNames.remove(p.getName()))
&& expectedNames.isEmpty());
but it only works if expectedNames is a temporary set created out of the static collection of expected names. As soon as you decide to replace your static collection by a Set, the first solution doesn’t require a temporary set and the latter has no advantage over it.
If the number of elements must be the same, then it would be better to compare sets:
List<Person> people = getPeopleFromDatabasePseudoMethod();
Set<String> expectedValues = new HashSet<>(Arrays.asList("john", "joe", "bill"));
assertEquals(expectedValues,
people.stream().map(Person::getName).collect(Collectors.toSet()));
The equals method of properly implemented sets should be able to compare different types of sets: it just checks whether the contents are the same (ignoring the order, of course).
Using assertEquals is more convenient because, in case of failure, the error message will contain the string representation of your sets.
You can just add another condition in your .filter() to only return the results which exist in your searchList once you have filtered by version.
It's slightly better to convert the searchList to a HashSet as you'll bring down the complexity of searching the companies from O(n) to O(1) and it'll also take care of removing any duplicate values you might have.
It's even better to pass in the HashSet instead of a list (if you have control over the interface design).
Here is a snippet where I'm first converting the searchList to a set and then adding a new condition in .filter() to only return the companies which are present in the searchList.
public List<Company> getCompanies(String country, List<String> searchList, String version) {
// Convert the given search list to a set
final Set<String> searchQueries = new HashSet<>(searchList);
List<Company> result = countriesByCountryCache.getUnchecked(country)
.stream()
.filter(s -> version.compareTo(s.getVersion()) >= 0 && searchQueries.contains(s.getName()))
.collect(Collectors.toList());
return result;
}
public List<Company> getCompanies(String country, List<String> searchList, String version) {
List<Company> result = countriesByCountryCache.getUnchecked(country)
.stream()
.filter(s -> version.compareTo(s.getVersion()) >= 0 && searchList.contains(s.getName()))
.collect(Collectors.toList());
return result;
}
Please check whether the above code works.