If necessary, I think you need to convert the MultiIndex columns to a flat Index by joining the levels:
df.columns = df.columns.map(''.join)
Or, if you need to remove a level, use droplevel:
df.columns = df.columns.droplevel(0)
If you need to access the values under one label, you can use xs:
df = df.xs('CID', axis=1, level=1)
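To compare the three options side by side, here is a minimal sketch on a made-up two-level column frame. The column labels and values are assumptions for illustration only, and "_" is used as the join separator instead of '' so the flattened names stay readable.

```python
import pandas as pd

# Hypothetical frame with two column levels (labels invented for this sketch)
cols = pd.MultiIndex.from_tuples([("a", "CID"), ("a", "FE"), ("b", "CID")])
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=cols)

# 1) Flatten: join both levels into a single string per column
flat = df.copy()
flat.columns = flat.columns.map("_".join)

# 2) Drop the outer level; note this can leave duplicate column names
dropped = df.copy()
dropped.columns = dropped.columns.droplevel(0)

# 3) Cross-section: keep only columns whose inner level is 'CID'
selected = df.xs("CID", axis=1, level=1)

print(flat.columns.tolist())
print(dropped.columns.tolist())
print(selected)
```

Note that map(''.join) only works when every level value is a string; with numeric labels you would need to convert them first.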
You can also check:
What is the difference between size and count in pandas?
EDIT:
Another solution for removing the MultiIndex is to select the ['FID'] column after groupby, so the result has only one level of columns:
df = df.groupby(by=['CID','FE'])['FID'].count().unstack().reset_index()
Sample (rename_axis is also added for nicer output):
import pandas as pd

df = pd.DataFrame({'CID':[2,2,3],
                   'FE':[5,5,6],
                   'FID':[1,7,9]})
print (df)
   CID  FE  FID
0    2   5    1
1    2   5    7
2    3   6    9
df = (df.groupby(by=['CID','FE'])['FID']
        .count()
        .unstack()
        .reset_index()
        .rename_axis(None, axis=1))
print (df)
   CID    5    6
0    2  2.0  NaN
1    3  NaN  1.0
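The counts in the output above become floats because unstack fills the missing CID/FE combinations with NaN. If zeros are acceptable for missing combinations (an assumption about what you want), a small variation of the same chain passes fill_value=0 to unstack so the counts stay integers:

```python
import pandas as pd

df = pd.DataFrame({'CID': [2, 2, 3],
                   'FE': [5, 5, 6],
                   'FID': [1, 7, 9]})

# Same chain as above, but missing CID/FE pairs are filled with 0
out = (df.groupby(by=['CID', 'FE'])['FID']
         .count()
         .unstack(fill_value=0)
         .reset_index()
         .rename_axis(None, axis=1))
print(out)
```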
Answer from jezrael on Stack Overflow
python - Pandas: getting rid of the multiindex - Stack Overflow
python - Pandas: drop a level from a multi-level column index? - Stack Overflow
DF with MultiIndex: how to drop rows where at least one column is all null?
pandas - How to remove multi index from dataframe in python? - Stack Overflow
This should get rid of the MultiIndex for CID and allow you to access it via df['CID']:
df = df.rename(columns={('CID',''):'CID'})
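In case the tuple-keyed rename does not change the columns in your pandas version, one alternative is to assign a flat list of names directly. This is only a sketch: the sample data and the "top_bottom" naming scheme are assumptions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'CID': [2, 2, 3],
                   'FE': [5, 5, 6],
                   'FID': [1, 7, 9]})

# Aggregating with a list of functions yields two-level columns;
# after reset_index the key column is labelled ('CID', '')
agg = df.groupby('CID').agg({'FID': ['count', 'sum']}).reset_index()

# Keep the top label when the bottom one is empty, otherwise join both
agg.columns = [top if bottom == '' else f'{top}_{bottom}'
               for top, bottom in agg.columns]

print(agg.columns.tolist())
print(agg['CID'].tolist())
```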
You can use MultiIndex.droplevel:
>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> df = pd.DataFrame([[1,2], [3,4]], columns=cols)
>>> df
   a
   b  c
0  1  2
1  3  4

[2 rows x 2 columns]
>>> df.columns = df.columns.droplevel()
>>> df
   b  c
0  1  2
1  3  4

[2 rows x 2 columns]
As of Pandas 0.24.0, we can now use DataFrame.droplevel():
cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1,2], [3,4]], columns=cols)
df.droplevel(0, axis=1)
#    b  c
# 0  1  2
# 1  3  4
This is very useful if you want to keep your DataFrame method-chain rolling.
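As a rough illustration of the method-chaining point, the sketch below drops the outer column level and keeps going in the same expression. The derived column name total and the filter threshold are made up for the example.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# droplevel returns a new DataFrame, so further steps can follow directly
result = (df.droplevel(0, axis=1)
            .assign(total=lambda d: d["b"] + d["c"])  # hypothetical derived column
            .query("total > 3"))                      # hypothetical filter
print(result)
```

With the columns-assignment form (df.columns = df.columns.droplevel()), the same steps would need a separate statement before the chain.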
I have a MultiIndex where the outer level is an ID and the inner level is a timestamp. I have various columns denoting measurements per ID per timestamp. If any of these measurements (columns) is completely null for a given ID, all the records for that ID are useless. This is because I can't use the measurements from other timestamps to impute the missing values for that ID.
How do I go about checking if there's "at least one column for which all values are null for this ID", and discarding all rows of that ID?
EDIT:
Found a workaround by creating a grouped-by-ID version of the table and using that as a lookup table to see which rows to delete, but if there's a way to do this in a single step that's smarter lmk.
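One possible way to do this without a separate lookup table is to broadcast a per-ID non-null count back onto every row and filter on it. This is a sketch under assumptions: the index level names id and timestamp, the column names m1 and m2, and the sample values are all invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: ID "B" has a measurement (m1) that is null at every timestamp
idx = pd.MultiIndex.from_product(
    [["A", "B", "C"], pd.to_datetime(["2021-01-01", "2021-01-02"])],
    names=["id", "timestamp"],
)
df = pd.DataFrame(
    {"m1": [1.0, np.nan, np.nan, np.nan, 3.0, 4.0],
     "m2": [1.0, 2.0, 5.0, 6.0, np.nan, 8.0]},
    index=idx,
)

# Non-null count per ID and column, repeated on every row of that ID
counts = df.groupby(level="id").transform("count")

# Keep a row only if every column has at least one non-null value for its ID
keep = (counts > 0).all(axis=1)
cleaned = df[keep]
print(cleaned)
```

IDs with partially missing columns (like "A" and "C" here) are kept; only IDs where some column has no values at all are dropped.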