1 Series

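A Series is a linear data structure: a one-dimensional labeled array.

By default, pandas uses the integers 0 to n-1 as the index of a Series, but you can also specify the index yourself (think of the index as the keys of a dict).

1.1 Creating a Series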

import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128])

print(s)
  • Output

0          9
1      zheng
2    beijing
3        128
dtype: object
  • Accessing part of the data

print(s[1:2])    # a slice returns a Series, even when it holds a single element

# Output
1    zheng
dtype: object
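
The slice above returns a one-element Series rather than the value itself. As a minimal sketch (reusing the same Series, and using `.iloc` so the positional intent is explicit), this is one way to get either form:

```python
import pandas as pd

s = pd.Series([9, 'zheng', 'beijing', 128])

one_item = s.iloc[1:2]   # slice by position: a Series holding one element
value = s.iloc[1]        # single position: the bare Python value

print(type(one_item).__name__)  # Series
print(value)                    # zheng
```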

1.2 Specifying the index

import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

print(s)
  • Output

1          9
2      zheng
3    beijing
e        128
f        usa
g        990
dtype: object
  • Looking up a value by its index label

print(s['f'])    # usa
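
Since the index behaves much like dict keys, a few dict-style operations carry over. The sketch below reuses the Series from this section; the label 'z' and the default 'n/a' are made up for illustration:

```python
import pandas as pd

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1, 2, 3, 'e', 'f', 'g'])

print('f' in s)            # True: membership tests the index labels, like dict keys
print('usa' in s)          # False: values are not checked
print(s.loc['f'])          # usa: explicit label-based lookup
print(s.get('z', 'n/a'))   # n/a: like dict.get, a default instead of a KeyError
```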

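1.3 Building a Series from a dictionary

import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "car": None}

sa = pd.Series(s, name="age")

print(sa)
  • Output (recent pandas versions keep the dict's insertion order, older versions sorted the keys; None becomes NaN, so the dtype is float64)

ton     20.0
mary    18.0
jack    19.0
car      NaN
Name: age, dtype: float64
  • Checking the type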

print(type(sa))    # <class 'pandas.core.series.Series'>
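
The None in the dictionary turns into NaN. A short sketch of how such missing entries might be detected and dropped (the variable name `ages` is only for this example):

```python
import pandas as pd

ages = pd.Series({"ton": 20, "mary": 18, "jack": 19, "car": None}, name="age")

print(ages.isna())                 # True only for "car"
print(ages.dropna())               # the three entries that have a value
print(ages.dropna().astype(int))   # back to integers once NaN is gone
```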

1.4 Building a Series from a NumPy ndarray

# Generate random numbers

import pandas as pd
import numpy as np

num_abc = pd.Series(np.random.randn(5), index=list('abcde'))
num = pd.Series(np.random.randn(5))

print(num)
print(num_abc)

# Output (the values differ on every run)
0   -0.102860
1   -1.138242
2    1.408063
3   -0.893559
4    1.378845
dtype: float64
a   -0.658398
b    1.568236
c    0.535451
d    0.103117
e   -1.556231
dtype: float64
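
Because the numbers above change on every run, a seeded generator can help when the output needs to be reproducible. This is a sketch using NumPy's `default_rng`; the seed 42 is an arbitrary choice, not something from the original example:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)    # 42 is an arbitrary seed chosen for this sketch
values = rng.standard_normal(5)    # a NumPy ndarray of 5 samples

num_abc = pd.Series(values, index=list('abcde'))

print(num_abc.dtype)               # float64
print(num_abc.to_numpy())          # the underlying data as an ndarray again
```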

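1.5 Selecting data

import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

# The index mixes integer and string labels, so use .iloc to select by position explicitly
print(s.iloc[1:3])     # positions 1 up to (not including) 3: zheng, beijing
print(s.iloc[[1,3]])   # positions 1 and 3: zheng, 128
print(s.iloc[:-1])     # everything except the last element: 9, zheng, beijing, 128, usa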
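
With an index that contains the integers 1, 2, 3 as labels, "1" can mean either a label or a position. The following sketch contrasts `.loc` (labels) with `.iloc` (positions) on the same Series:

```python
import pandas as pd

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1, 2, 3, 'e', 'f', 'g'])

print(s.loc[1])                  # label 1           -> 9
print(s.iloc[1])                 # position 1        -> zheng
print(s.loc[[1, 3]].tolist())    # labels 1 and 3    -> [9, 'beijing']
print(s.iloc[[1, 3]].tolist())   # positions 1 and 3 -> ['zheng', 128]
```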

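1.6 Operating on data

import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

# Addition aligns on index labels; a label present on only one side gives NaN
total = s.iloc[1:3] + s.iloc[1:3]   # named total to avoid shadowing the built-in sum
total1 = s.iloc[1:4] + s.iloc[1:4]
total2 = s.iloc[1:3] + s.iloc[1:4]
total3 = s.iloc[:3] + s.iloc[1:]

print(total)
print(total1)
print(total2)
print(total3)

# Output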

2        zhengzheng
3    beijingbeijing
dtype: object
2        zhengzheng
3    beijingbeijing
e               256
dtype: object
2        zhengzheng
3    beijingbeijing
e               NaN
dtype: object
1               NaN
2        zhengzheng
3    beijingbeijing
e               NaN
f               NaN
g               NaN
dtype: object
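
The NaN entries above come from index alignment: a label that exists in only one operand has nothing to be added to. A small numeric sketch (the Series `a` and `b` are invented for illustration) shows the same effect and how `Series.add` with `fill_value` can avoid it:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=list('abc'))
b = pd.Series([10, 20, 30], index=list('bcd'))

print(a + b)                    # 'a' and 'd' exist on one side only -> NaN
print(a.add(b, fill_value=0))   # treat the missing side as 0 instead
```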

1.7 Searching

  • Filtering by a condition
    import pandas as pd
    import numpy as np
     
    s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
     
    sa = pd.Series(s, name="age")
     
    print(sa[sa>19])
    

  • Median
    import pandas as pd
    import numpy as np
     
    s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
     
    sa = pd.Series(s, name="age")
     
    print(sa.median())  # 20.0 (NaN is skipped)
    

     

  • Checking which values are greater than the median
    import pandas as pd
    import numpy as np
     
    s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
     
    sa = pd.Series(s, name="age")
     
    print(sa>sa.median())
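
The comparison above yields a boolean Series, which can itself be used as a filter. As a sketch on the same data, this is how several conditions might be combined:

```python
import pandas as pd

sa = pd.Series({"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}, name="age")

above = sa > sa.median()           # NaN is skipped by median() and compares as False
print(sa[above])                   # jim and lj
print(sa[(sa > 18) & (sa < 24)])   # combine masks with & and parentheses
print(sa[sa.between(19, 22)])      # between() includes both endpoints by default
```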
    

1.9 Assigning one value to every element that satisfies a condition

    import pandas as pd
    import numpy as np
     
    s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
     
    sa = pd.Series(s, name="age")
     
    print(s)  # print the original dict

    print('---------------------')   # separator

    sa[sa>19] = 88  # set every value greater than 19 to 88

    print(sa)  # print the modified Series

    print('---------------------')   # separator

    print(sa / 2)  # divide every value by 2
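
The assignment above modifies the Series in place. If the original values should be kept, `Series.mask` offers a non-mutating alternative; the sketch below assumes the same data, and the name `capped` is only for this example:

```python
import pandas as pd

sa = pd.Series({"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}, name="age")

capped = sa.mask(sa > 19, 88)   # new Series: 88 wherever the condition is True

print(capped)                   # ton, jim, lj become 88; car stays NaN
print(sa)                       # the original Series is unchanged
```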
    
