Creating a (new) column coerced with to_numeric

- Converting the column with astype(int) raises an error.
- It is quicker to convert with pd.to_numeric(s, errors='coerce') and compare the result.

df2
| | A | B | C |
|---|---|---|---|
| ONE | 2 | 1 | @ |
| TWO | 4 | 2 | 6 |
| THREE | 6 | 3 | - |
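The claim that astype(int) fails on a column like C can be checked with a minimal sketch. The construction of df2 is not shown in the original, so the frame below is a hypothetical reconstruction that assumes the values of B and C are stored as strings; the name `astype_failed` is introduced only for illustration.

```python
import pandas as pd

# Hypothetical reconstruction of df2 (assumption: B and C hold strings).
df2 = pd.DataFrame(
    {"A": [2, 4, 6], "B": ["1", "2", "3"], "C": ["@", "6", "-"]},
    index=["ONE", "TWO", "THREE"],
)

# astype(int) cannot parse '@' or '-', so the whole conversion raises.
try:
    df2["C"].astype(int)
    astype_failed = False
except ValueError as exc:
    astype_failed = True
    print(f"astype(int) failed: {exc}")
```

Because a single unparseable value aborts the whole conversion, coercing with errors='coerce' is the more forgiving route for a column of this kind.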
df2['C_2'] = pd.to_numeric(df2['C'], errors='coerce')
df2
| | A | B | C | C_2 |
|---|---|---|---|---|
| ONE | 2 | 1 | @ | NaN |
| TWO | 4 | 2 | 6 | 6.0 |
| THREE | 6 | 3 | - | NaN |
(df2['C'] == 0).value_counts()
False    3
Name: C, dtype: int64
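Comparing the raw column against a number yields only False, presumably because its entries are strings and a string never equals an integer. The sketch below, which assumes C holds the strings shown in the table, contrasts the raw comparison with the same comparison on the coerced values. The names `c`, `c_num`, `raw_matches`, and `num_matches` are illustrative only.

```python
import pandas as pd

# Assumption: column C holds strings, as the displayed values suggest.
c = pd.Series(["@", "6", "-"], index=["ONE", "TWO", "THREE"], name="C")

# Raw comparison: the string "6" is not equal to the integer 6.
raw_matches = int((c == 6).sum())

# Coerce first: unparseable entries become NaN, "6" becomes 6.0.
c_num = pd.to_numeric(c, errors="coerce")
num_matches = int((c_num == 6).sum())

print(raw_matches, num_matches)
```

NaN compares as False against any number, so the coerced entries simply drop out of the count.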
# A new column "C_3" is created here as an example, but overwriting the existing column also works
df2['C_3'] = df2['C_2'].fillna(0)
df2
| | A | B | C | C_2 | C_3 |
|---|---|---|---|---|---|
| ONE | 2 | 1 | @ | NaN | 0.0 |
| TWO | 4 | 2 | 6 | 6.0 | 6.0 |
| THREE | 6 | 3 | - | NaN | 0.0 |
df2['C_3'] = pd.to_numeric(df2['C_3'], downcast='integer')
display(df2)
print(df2.dtypes)
| | A | B | C | C_2 | C_3 |
|---|---|---|---|---|---|
| ONE | 2 | 1 | @ | NaN | 0 |
| TWO | 4 | 2 | 6 | 6.0 | 6 |
| THREE | 6 | 3 | - | NaN | 0 |
A        int64
B       object
C       object
C_2    float64
C_3       int8
dtype: object
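The three steps above (coerce, fill, downcast) could be bundled into one function. The following is a minimal sketch: `coerce_to_int` and its `fill_value` parameter are hypothetical names, not part of pandas, and the input series assumes C holds strings.

```python
import pandas as pd


def coerce_to_int(s: pd.Series, fill_value: int = 0) -> pd.Series:
    """Hypothetical helper: coerce to numeric, fill NaN, then downcast."""
    numeric = pd.to_numeric(s, errors="coerce")  # unparseable values become NaN
    filled = numeric.fillna(fill_value)          # still float64 at this point
    # The downcast only takes effect when every value is a whole number.
    return pd.to_numeric(filled, downcast="integer")


# Assumption: column C holds strings, as the displayed values suggest.
c = pd.Series(["@", "6", "-"], index=["ONE", "TWO", "THREE"], name="C")
c_int = coerce_to_int(c)
print(c_int.dtype)
```

Note that filling with 0 makes the originally invalid entries indistinguishable from real zeros, so keeping the intermediate NaN column (as C_2 does here) may be worthwhile when that distinction matters.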