If you only need the unique values, you can just use the unique method, which returns a NumPy array. If you want a Python set, use set(some_series):
In [1]: s = pd.Series([1, 2, 3, 1, 1, 4])
In [2]: s.unique()
Out[2]: array([1, 2, 3, 4])
In [3]: set(s)
Out[3]: {1, 2, 3, 4}
However, if you have a DataFrame, first select the Series out of it ( some_data_frame['<col_name>'] ).
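Putting the two steps together, a minimal sketch (the DataFrame and the column name "col" are made up for illustration):

```python
import pandas as pd

# Hypothetical DataFrame; "col" stands in for your actual column name.
df = pd.DataFrame({"col": [1, 2, 3, 1, 1, 4], "other": list("abcdef")})

# Select the Series first, then deduplicate.
unique_values = df["col"].unique()   # NumPy array of uniques, in order of appearance
as_set = set(df["col"])              # plain Python set
```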
With a large Series that contains many duplicates, set(some_series) has to hash every element, so it is noticeably slower than deduplicating first. Better practice is set(some_series.unique()).
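Both spellings produce the same set; the speed difference only shows up at scale, so this sketch just verifies that they are interchangeable:

```python
import pandas as pd

# A Series with heavy duplication (repeated values dominate).
s = pd.Series([1, 2, 3, 1, 1, 4] * 1000)

# Deduplicating with .unique() first hashes far fewer elements,
# but the resulting set is identical.
assert set(s) == set(s.unique()) == {1, 2, 3, 4}
```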
How can I convert my pandas dataframe into this format?
```
  sets items  weight  value
0 set1     a       9     10
1 set1     b      14    100
2 set2     c       5     69
3 set2     d       4    100
```
Outcome I'm looking for:

```
set1 = (("a", 9, 10), ("b", 14, 100))
set2 = (("c", 5, 69), ("d", 4, 100))
```

so that print(set1) gives the first tuple of tuples.
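One way to build those tuples is a groupby over the sets column; a minimal sketch (the dict comprehension and variable names are my own, but the output shape matches the outcome above):

```python
import pandas as pd

df = pd.DataFrame({
    "sets":   ["set1", "set1", "set2", "set2"],
    "items":  ["a", "b", "c", "d"],
    "weight": [9, 14, 5, 4],
    "value":  [10, 100, 69, 100],
})

# Group by "sets" and collect each group's (items, weight, value)
# rows as a tuple of plain tuples.
grouped = {
    name: tuple(g[["items", "weight", "value"]].itertuples(index=False, name=None))
    for name, g in df.groupby("sets")
}

set1 = grouped["set1"]  # (("a", 9, 10), ("b", 14, 100))
set2 = grouped["set2"]  # (("c", 5, 69), ("d", 4, 100))
```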
You can use the assign function:
df = df.assign(industry='yyy')
Python can do unexpected things when new objects are defined from existing ones. You stated in a comment above that your dataframe is defined along the lines of df = df_all.loc[df_all['issueid']==specific_id,:]. In this case, df is really just a stand-in for the rows stored in the df_all object: a new object is NOT created in memory.
To avoid these issues altogether, I often have to remind myself to use the copy module, which explicitly forces objects to be copied in memory so that methods called on the new objects are not applied to the source object. I had the same problem as you, and avoided it using the deepcopy function.
In your case, this should get rid of the warning message:
from copy import deepcopy
df = deepcopy(df_all.loc[df_all['issueid']==specific_id,:])
df['industry'] = 'yyy'
EDIT: As David M. points out in the comments, pandas' built-in .copy() method does the same thing without importing deepcopy:
df = df_all.loc[df_all['issueid']==specific_id,:].copy()
df['industry'] = 'yyy'
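A self-contained sketch of that pattern (the issueid/industry column names follow the example above; the data is made up):

```python
import pandas as pd

df_all = pd.DataFrame({
    "issueid": [1, 1, 2],
    "value":   [10, 20, 30],
})
specific_id = 1

# .copy() gives df its own memory, so the assignment below cannot
# write through to df_all and triggers no SettingWithCopyWarning.
df = df_all.loc[df_all["issueid"] == specific_id, :].copy()
df["industry"] = "yyy"
```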
df.reset_index().melt(id_vars='index').drop(columns='variable')
Output:
index value
0 Saturday 2540.0
1 Sunday 1313.0
2 Monday 1360.0
3 Tuesday 1089.0
4 Wednesday 1329.0
5 Thursday 798.0
6 Saturday 2441.0
7 Sunday 1891.0
8 Monday 1558.0
9 Tuesday 2105.0
10 Wednesday 1658.0
11 Thursday 1195.0
12 Saturday 3832.0
13 Sunday 2968.0
14 Monday 2967.0
15 Tuesday 2476.0
16 Wednesday 2073.0
17 Thursday 2183.0
18 Saturday 4093.0
19 Sunday 2260.0
20 Monday 2156.0
21 Tuesday 1577.0
22 Wednesday 2403.0
23 Thursday 1287.0
24 Saturday 1455.0
25 Sunday 1454.0
26 Monday 1564.0
27 Tuesday 1744.0
28 Wednesday 1231.0
29 Thursday 1460.0
30 Saturday 2552.0
31 Sunday 1798.0
32 Monday 1752.0
33 Tuesday 1457.0
34 Wednesday 874.0
35 Thursday 1269.0
Note: just noticed a comment suggesting to do the same thing, I will delete my post if requested :)
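To make the melt chain reproducible, here's a minimal sketch on a made-up two-column frame (the real df has six weekday rows and more columns, but the mechanics are identical):

```python
import pandas as pd

df = pd.DataFrame(
    {"wk1": [2540.0, 1313.0], "wk2": [2441.0, 1891.0]},
    index=["Saturday", "Sunday"],
)

# melt stacks the value columns; dropping "variable" keeps only
# the original index labels and the values, one column at a time.
out = df.reset_index().melt(id_vars="index").drop(columns="variable")
```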
Create it with numpy by reshaping the data.
import pandas as pd
import numpy as np
pd.DataFrame(df.to_numpy().flatten('F'),
             index=np.tile(df.index, df.shape[1]),
             columns=['items'])
Output:
items
Saturday 2540.0
Sunday 1313.0
Monday 1360.0
Tuesday 1089.0
Wednesday 1329.0
Thursday 798.0
Saturday 2441.0
...
Sunday 1798.0
Monday 1752.0
Tuesday 1457.0
Wednesday 874.0
Thursday 1269.0
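The same NumPy reshape, run on a small made-up frame so the mechanics are visible end to end:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"wk1": [2540.0, 1313.0], "wk2": [2441.0, 1891.0]},
    index=["Saturday", "Sunday"],
)

# 'F' (column-major) flattening stacks the values column after column,
# and np.tile repeats the index once per column to line up with them.
out = pd.DataFrame(
    df.to_numpy().flatten("F"),
    index=np.tile(df.index, df.shape[1]),
    columns=["items"],
)
```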