julia dataframe select multiple columns - Brave Search

Recommended way to select multiple DataFrame columns parametrically, i.e. not by the names

discourse.julialang.org › t › recommended-way-to-select-multiple-dataframe-columns-parametrically-i-e-not-by-the-names › 74220

using names in such transforms is almost never needed. You can do just: julia> df = DataFrame(reshape(1:30, 3, 10), :auto) 3×10 DataFrame Row │ x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 │ Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 Int64 ─────┼─────… Answer from bkamins on discourse.julialang.org

Julia Programming Language

discourse.julialang.org › new to julia

DataFramesMeta selecting multiple columns - New to Julia - Julia Programming Language

November 16, 2020 - Using DataFramesMeta (or others), can I select multiple columns of choice and do some calculations? Here’s some mock data. I would like to subset by column x, and then calculate column-wise means. Here’s the answer. df = DataFrame(x = [1,1,1,2,3,2,3,2,3,], y1 = [2,1,2,1,2,1,2,1,2], y2 = ...

dataframes.juliadata.org › stable › man › working_with_dataframes

Working with DataFrames · DataFrames.jl

Finally, you can use Not, Between, Cols and All selectors in more complex column selection scenarios (note that Cols() selects no columns while All() selects all columns therefore Cols is a preferred selector if you write generic code). Here are examples of using each of these selectors: julia> df = DataFrame(r=1, x1=2, x2=3, y=4) 1×4 DataFrame Row │ r x1 x2 y │ Int64 Int64 Int64 Int64 ─────┼──────────────────────────── 1 │ 1 2 3 4 julia> df[:, Not(:r)] # drop :r column 1×3 DataFrame Row │ x1 x2 y │ Int64 Int64 Int
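The selectors named in this snippet can be tried side by side. The following is a minimal sketch, assuming DataFrames.jl 1.x is installed; the result variable names are illustrative only and do not come from the documentation.

```julia
using DataFrames

# One-row table with the same shape as the one in the snippet above.
df = DataFrame(r=1, x1=2, x2=3, y=4)

no_r       = df[:, Not(:r)]            # every column except :r
x_block    = df[:, Between(:x1, :x2)]  # contiguous block from :x1 through :x2
picked     = df[:, Cols(:r, r"x")]     # :r plus every column whose name matches the regex
everything = df[:, All()]              # all columns

println(names(no_r))
println(names(x_block))
println(names(picked))
println(names(everything))
```

Cols takes the union of its arguments, so it can combine a single name, a regex and a range in one call.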

Videos

12 - How to Select Multiple DataFrame Columns #shorts [Julia & ...

August 27, 2022

Selecting Columns by Multi-Level Index Equivalent in Julia DataFrames ...

December 28, 2025

Julia DataFrames : How to Select Columns - YouTube

November 25, 2017

20 - Splitting a dataframe column into multiple ones #shorts [Julia ...

January 17, 2023

Julia DataFrames: How to Select & Work With Rows - YouTube

November 25, 2017

juliabloggers.com › column-selectors-in-dataframes-jl

Column selectors in DataFrames.jl | juliabloggers.com

You can put any valid other column selector inside Not:

julia> df[:, Not("a")]
1×5 DataFrame
 Row │ b      x1     x2     y1     y2
     │ Int64  Int64  Int64  Int64  Int64
─────┼───────────────────────────────────
   1 │     2      3      4      5      6

julia> df[:, Not(r"x")]
1×4 DataFrame
 Row │ a      b      y1     y2
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1      2      5      6

julia> df[:, Not(Between(:x1, end))]
1×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      2

Juliadatascience

juliadatascience.io › select

Select - Julia Data Science

First, let’s create a dataset with multiple columns: function responses() id = [1, 2] q1 = [28, 61] q2 = [:us, :fr] q3 = ["F", "B"] q4 = ["B", "C"] q5 = ["A", "E"] DataFrame(; id, q1, q2, q3, q4, q5) end responses() Here, the data represents answers for five questions (q1, q2, …, q5) in a given questionnaire. We will start by “selecting” a few columns from this dataset. As usual, we use symbols to specify columns: ... Additionally, we can use Regular Expressions with Julia’s regex string literal.

stackoverflow.com › questions › 32558184 › how-to-select-only-a-subset-of-dataframe-columns-in-julia

How to select only a subset of dataframe columns in julia - Stack Overflow

EDIT 2/7/2021: as people seem to still find this on Google, I'll edit this to say right at the top that current DataFrames (1.0+) allows both Not() selection supported by InvertedIndices.jl and also string types as column names, including regex selection with the r"" string macro. Examples:

julia> df = DataFrame(a1 = rand(2), a2 = rand(2), x1 = rand(2), x2 = rand(2), y = rand(["a", "b"], 2))
2×5 DataFrame
 Row │ a1        a2        x1        x2        y      
     │ Float64   Float64   Float64   Float64   String 
─────┼────────────────────────────────────────────────
   1 │ 0.784704  0.963761  0.124937  0.37532   a
   2 │ 0.814647  0.986194  0.236149  0.468216  a

julia> df[!, r"2"]
2×2 DataFrame
 Row │ a2        x2       
     │ Float64   Float64  
─────┼────────────────────
   1 │ 0.963761  0.37532
   2 │ 0.986194  0.468216

julia> df[!, Not(r"2")]
2×3 DataFrame
 Row │ a1        x1        y      
     │ Float64   Float64   String 
─────┼────────────────────────────
   1 │ 0.784704  0.124937  a
   2 │ 0.814647  0.236149  a

Finally, the names function has a method which takes a type as its second argument, which is handy for subsetting DataFrames by the element type of each column:


julia> df[!, names(df, String)]
2×1 DataFrame
 Row │ y      
     │ String 
─────┼────────
   1 │ a
   2 │ a

In addition to indexing with square brackets, there's also the select function (and its mutating equivalent select!), which basically takes the same input as the column index in []-indexing as its second argument:

julia> select(df, Not(r"a"))
2×3 DataFrame
 Row │ x1        x2        y      
     │ Float64   Float64   String 
─────┼────────────────────────────
   1 │ 0.124937  0.37532   a
   2 │ 0.236149  0.468216  a
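To make the difference between select, select! and type-based names concrete, here is a small sketch assuming DataFrames.jl 1.x. The data values and variable names are made up for illustration.

```julia
using DataFrames

# Hypothetical data: three Float64 columns and one String column.
df = DataFrame(a1=[0.1, 0.2], a2=[0.3, 0.4], x1=[0.5, 0.6], y=["a", "b"])

# names(df, T) returns the names of columns whose element type is a subtype of T.
float_cols  = names(df, Float64)
string_cols = names(df, String)

# select returns a new data frame and leaves df untouched.
floats_only = select(df, float_cols)
still_four  = ncol(df)

# select! mutates df in place; here it drops every column whose name contains "a".
select!(df, Not(r"a"))

println(float_cols)
println(string_cols)
println(names(df))
```

Because select! changes df itself, it is worth keeping a copy if the original columns are needed later.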

Original answer below

As @Reza Afzalan said, what you're trying to do returns an array of strings, while column names in DataFrames are symbols.

Given that Julia doesn't have conditional list comprehension, the nicest thing you could do I guess would be

data[:, filter(x -> x != :column1, names(data))]

This will give you the data set with column 1 removed (without mutating it). You could extend this to checking against lists of names as well:

data[:, filter(x -> !(x in [:column1,:column2]), names(data))]

UPDATE: As Ian says below, for this use case the Not syntax is now the best way to go.

More generally, conditional list comprehensions are also available by now, so you could do:

data[:, [x for x in names(data) if x != :column1]]

As of DataFrames 0.19, it seems that you can now do

select(data, Not(:column1))

to select all but the column column1. To select all except for multiple columns, use an array in the inverted index:

select(data, Not([:column1, :column2]))
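One caveat when adapting the older filter-based snippets: in current DataFrames.jl releases names(data) returns a vector of strings, while propertynames(data) returns symbols, so the comparison has to use the matching type. A minimal sketch, assuming DataFrames.jl 1.x and a made-up three-column table:

```julia
using DataFrames

# Hypothetical table; the column names follow the answer above.
data = DataFrame(column1=1:2, column2=3:4, column3=5:6)

# names(data) yields strings, so compare against a string.
kept_str = data[:, [n for n in names(data) if n != "column1"]]

# propertynames(data) yields symbols, so compare against a symbol.
kept_sym = data[:, filter(!=(:column1), propertynames(data))]

# Not avoids the name bookkeeping entirely.
kept_not = select(data, Not([:column1, :column2]))

println(names(kept_str))
println(names(kept_sym))
println(names(kept_not))
```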

Julia Programming Language

discourse.julialang.org › general usage

Select multiple columns from pandas dataframe - General Usage - Julia Programming Language

julia> df = pd.DataFrame(rand(8,5)); julia> get(df[:iloc], (0:py"len"(df)-1,0:2)) PyObject 0 1 2 0 0.957077 0.598744 0.226023 1 0.177671 0.993589 0.037180 2 0.719229 0.271832 0.527947 3 0.538052 0.648124 0.251723 4 0.943631 0.770860 0.725607 5 0.610345 0.91…

dataframes.juliadata.org › stable › man › getting_started

Getting Started · DataFrames.jl

The above operations did not work because when you use : as row selector the :B column is updated in-place, and it only supports storing strings. ... julia> df.B = df.B .== "F" 8-element BitVector: 0 1 1 0 1 0 0 1 julia> df 8×3 DataFrame Row │ A B C │ Int64 Bool Int64 ─────┼───────────────────── 1 │ 1 false 0 2 │ 2 true 0 3 │ 3 true 0 4 │ 4 false 0 5 │ 5 true 0 6 │ 6 false 0 7 │ 7 false 0 8 │ 8 true 0

stackoverflow.com › questions › 63155661 › multiple-column-selection-on-a-julia-dataframe

select - Multiple column selection on a Julia DataFrame - Stack Overflow

df2 = select(df, Between(:A,:D), Between(:P,:Z))

or

df2 = df[:, Cols(Between(:A,:D), Between(:P,:Z))]

if you are sure your columns are only from :A to :Z you can also write:

df2 = select(df, Not(Between(:E, :O)))

or

df2 = df[:, Not(Between(:E, :O))]

Finally, you can easily find the index of a column using the columnindex function, e.g.:

columnindex(df, :A)

and later use column numbers, if that is what you would prefer.

In Julia you can also build Ranges with Chars and hence when your columns are named just by single letters yet another option is:

df[:, Symbol.(vcat('A':'D', 'P':'Z'))]
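The alternatives in this answer can be checked against each other on a toy table. This is a sketch assuming DataFrames.jl 1.x; the 26 single-letter columns are constructed here purely for illustration.

```julia
using DataFrames

# Hypothetical table with one row and columns named :A through :Z.
cols = Symbol.('A':'Z')
df = DataFrame([c => [i] for (i, c) in enumerate(cols)])

by_select  = select(df, Between(:A, :D), Between(:P, :Z))
by_cols    = df[:, Cols(Between(:A, :D), Between(:P, :Z))]
by_not     = df[:, Not(Between(:E, :O))]
by_symbols = df[:, Symbol.(vcat('A':'D', 'P':'Z'))]

# columnindex gives the integer position of a column.
first_pos = columnindex(df, :A)

println(names(by_select))
println(first_pos)
```

All four selections return the same 15 columns, so the choice is mostly a matter of readability.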

bkamins.github.io › julialang › 2023 › 08 › 18 › selectcolumn.html

DataFrames.jl survey: selecting columns of a data frame based on their values | Blog by Bogumił Kamiński

August 18, 2023 - For example, assume that we want to pick columns that contain missing value. In this case the easiest way to do it is to use the eachcol(df) iterator over columns of our data frame: julia> select(df, any.(ismissing, eachcol(df))) 2×2 DataFrame Row │ a2 b1 │ Int64?
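A self-contained version of the value-based selection described here might look as follows. It is a sketch assuming DataFrames.jl 1.x; the table contents are invented, and only the select call itself comes from the snippet.

```julia
using DataFrames

# Hypothetical table: two columns contain a missing value, two do not.
df = DataFrame(a1=[1, 2], a2=[1, missing], b1=[missing, 2], b2=[3, 4])

# One Bool per column: true when the column holds at least one missing value.
has_missing = any.(ismissing, eachcol(df))

# A Bool vector is accepted as a column selector.
with_missing = select(df, has_missing)

println(has_missing)
println(names(with_missing))
```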

Julia Programming Language

discourse.julialang.org › new to julia

How to select rows from a dataframe and then create a dataframe with multiple columns from the selection - New to Julia - Julia Programming Language

df_test = test_data[test_data.col1 .== "TEST", ["col1", "col3", "col5"]]

or, if you prefer to generate column names dynamically:

df_test = test_data[test_data.col1 .== "TEST", string.("col", 1:2:5)]

In general data frame indexing has the following form: source_data_frame[row_selector, column_selec…
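A runnable sketch of the combined row and column selection, assuming DataFrames.jl 1.x. The contents of test_data are made up; only the indexing expressions follow the answer.

```julia
using DataFrames

# Hypothetical input table with five columns named col1 through col5.
test_data = DataFrame(col1=["TEST", "other", "TEST"],
                      col2=1:3, col3=4:6, col4=7:9, col5=10:12)

# Row selector: a Bool mask. Column selector: an explicit list of names.
by_list = test_data[test_data.col1 .== "TEST", ["col1", "col3", "col5"]]

# Same selection with the column names generated by broadcasting string.
wanted = string.("col", 1:2:5)
by_generated = test_data[test_data.col1 .== "TEST", wanted]

println(wanted)
println(size(by_generated))
```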

jkrumbiegel.com › pages › 2021-12-28-new-features-dataframemacros

jkrumbiegel.com – Multi-columns, shortcut strings and subset transformations in DataFrameMacros.jl v0.2

December 28, 2021 - So far, DataFrameMacros.jl only supported statements with single-column specifiers. For example, @select(df, :x + 1) or @combine(df, $column_variable * $2). The expressions :x, $column_variable and $2 all refer to one column each. The underlying source-function-sink expression that DataFrameMacros.jl created was therefore always of the form source => function => sink.

educative.io › answers › how-to-select-a-subset-of-dataframe-columns-in-julia

How to select a subset of DataFrame columns in Julia

We can assign this new DataFrame to a separate variable named df. ... Let’s explain the code provided above. Lines 8–9: We use select() to subset the columns and assign the new DataFrame to a variable named df1 and then we print out df1.

Julia Programming Language

discourse.julialang.org › new to julia

How to select dataframe column by name? - New to Julia - Julia Programming Language

Select a column with df[:, :col] or df[:, "col"].

stackoverflow.com › questions › 29421092 › select-subset-of-rows-of-dataframe-using-multiple-conditions

julia - Select subset of rows of dataframe using multiple conditions - Stack Overflow

This is a Julia thing, not so much a DataFrame thing: you want & instead of &&. For example:

julia> [true, true] && [false, true]
ERROR: TypeError: non-boolean (Array{Bool,1}) used in boolean context

julia> [true, true] & [false, true]
2-element Array{Bool,1}:
 false
  true

julia> df[(df[:A].<5)&(df[:B].=="c"),:]
2x2 DataFrames.DataFrame
| Row | A | B   |
|-----|---|-----|
| 1   | 3 | "c" |
| 2   | 4 | "c" |

FWIW, this works the same way in pandas in Python:

>>> df[(df.A < 5) & (df.B == "c")]
   A  B
1  3  c
2  4  c

I have the same problem now as https://stackoverflow.com/users/5526072/jwimberley , occurring on my update to Julia 0.6 from 0.5, and now using DataFrames v0.10.1.

Update: I made the following change to fix:

r[(r[:l] .== l) & (r[:w] .== w), :] # julia 0.5

r[.&(r[:l] .== l, r[:w] .== w), :] # julia 0.6

but this gets very slow with long chains (time taken ∝ 2^chains), so maybe Query is the better way now:

# r is a dataframe
using Query
q1 = @from i in r begin
    @where i.l == l && i.w == w && i.nl == nl && i.lt == lt && 
    i.vz == vz && i.vw == vw && i.vδ == vδ && 
    i.ζx == ζx && i.ζy == ζy && i.ζδx == ζδx
    @select {absu=i.absu, i.dBU}
    @collect DataFrame
end

for example. This is fast. It's in the DataFrames documentation.
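The answers above use syntax from old Julia and DataFrames versions. On current releases the same multi-condition row filter is usually written with broadcast `.&` or with subset. A minimal sketch, assuming DataFrames.jl 1.x and an invented two-column table:

```julia
using DataFrames

# Hypothetical table loosely matching the A/B example above.
df = DataFrame(A=[3, 4, 7], B=["c", "c", "d"])

# Element-wise AND needs the dotted operator; plain && only works on scalars.
mask = (df.A .< 5) .& (df.B .== "c")
by_mask = df[mask, :]

# subset applies each condition row by row and keeps rows where all hold.
by_subset = subset(df, :A => ByRow(<(5)), :B => ByRow(==("c")))

println(mask)
println(nrow(by_subset))
```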

bkamins.github.io › julialang › 2021 › 01 › 30 › bang.html

On the bang row selector in DataFrames.jl | Blog by Bogumił Kamiński

January 30, 2021 - Note that for multiple column selection you can alternatively use the select function. The difference between select and indexing is that select returns a data frame even if a single column is selected, e.g. like this: julia> select(df, 1) 3×1 DataFrame Row │ col1 │ Int64 ─────...
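The difference in return types, and between the `!` and `:` row selectors, can be seen in a short sketch. It assumes DataFrames.jl 1.x; the table is hypothetical.

```julia
using DataFrames

# Hypothetical three-row table.
df = DataFrame(col1=1:3, col2=4:6)

as_frame  = select(df, 1)  # a data frame with one column
as_frame2 = df[:, [1]]     # a vector of indices also yields a data frame
copied    = df[:, 1]       # a plain vector, copied from the stored column
aliased   = df[!, 1]       # the stored column itself, no copy

println(typeof(as_frame))
println(typeof(copied))
println(aliased === df.col1)
println(copied === df.col1)
```

Mutating aliased changes df, while mutating copied does not.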

julia.school › julia › dataframes

How to use dataframes in Julia

February 20, 2025 - To permanently delete the condition column, do this: ... 4×3 DataFrame Row │ item id kind │ String Int64 String ─────┼─────────────────────────────────── 1 │ Mars Rover 100 Rover 2 │ Venus Explorer 101 Spaceship 3 │ Lunar Rover 102 Rover 4 │ 30% Sun Shade 103 Sun Shade · All we’re doing here is using the DataFrames package’s in-built function select!() to—you guessed it—select the columns we want.

juliadata.github.io › DataFrames.jl › stable › lib › functions

Functions · DataFrames.jl

The cols column selector can be any value accepted as column selector by the names function. Note that mapcols guarantees not to reuse the columns from df in the returned DataFrame. If f returns its argument then it gets copied before being stored. Metadata: this function preserves table-level and column-level :note-style metadata. ... julia> df = DataFrame(x=1:4, y=11:14) 4×2 DataFrame Row │ x y │ Int64 Int64 ─────┼────────────── 1 │ 1 11 2 │ 2 12 3 │ 3 13 4 │ 4 14 julia> mapcols(x -> x.^2, df) 4×2 DataFrame Row │ x y │ Int64 Int64 ─