Pandas
Frequently used code for pandas related code snippets
plot columns
df.plot(figsize=(20,10))fill na
dt.fillna(0)Reading data from Postgresql
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine("postgresql://user@localhost:5432/mydb")
df = pd.read_sql_query("select * from table", con=engine)Add rows
dt.append({"col1":[1,2,3], "coln":[4,5,6]})pivoting dataframe
df.pivot(columns="col_with_columns",values="col_with_values")getting index of max/min values
df['column'].idxmax()Count instances of values in column
df['target'].value_counts().sort_values(ascending=False)Filtering by column's value len
df[df['itemsets'].map(len) > 2]Iterate by rows
Subsetting columns
or by regex pattern
Parse datetimes
Group by dates
Plot evolution over time
New column A based on column B
To select rows whose column value equals a scalar, some_value, use ==:
To select rows whose column value is in an iterable, some_values, use isin:
To select rows whose column value is in another column array
Combine multiple conditions with &:
Note the parentheses. Due to Python's operator precedence rules, & binds more tightly than <= and >=. Thus, the parentheses in the last example are necessary. Without the parentheses
is parsed as
To select rows whose column value does not equal some_value, use !=:
To select rows whose value is not in some_values
Improve query efficiency by setting an index
Order columns lexicographically
Remove duplicates (distinct)
Check if a field exists
Replacing column values
One-hot encoding
Rename column
Rename index names
Remove column from list
Join dataframes
Iterate by groups
Group data by year
Count unique by group
Collect list in Group by
Get normalized values from groupby count
Generate csv from dataframe
Sum to date column
Replace character in column
References
Last updated
Was this helpful?