Machine Learning - Pandas Exercises: A Quick Reference for Common Operations

2024/4/24 10:40:23 / Source: https://blog.csdn.net/galoiszhou/article/details/130264022

Machine learning notes

Pandas

Install the pandas library:

conda install pandas

Data

git clone https://github.com/KeithGalli/pandas.git

Exercises

import pandas as pd
data_dir = "/data_dir"
df = pd.read_csv(f'{data_dir}/pandas/pokemon_data.csv')
df.shape
(800, 12)
df.head()
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Grass     Poison     45      49       49       65       65     45           1      False
1      2  Ivysaur                    Grass     Poison     60      62       63       80       80     60           1      False
2      3  Venusaur                   Grass     Poison     80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Grass     Poison     80     100      123      122      120     80           1      False
4      4  Charmander                 Fire      NaN        39      52       43       60       50     65           1      False
df = pd.read_excel(f'{data_dir}/pandas/pokemon_data.xlsx')
df.shape
(800, 12)
df.head()
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Grass     Poison     45      49       49       65       65     45           1      False
1      2  Ivysaur                    Grass     Poison     60      62       63       80       80     60           1      False
2      3  Venusaur                   Grass     Poison     80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Grass     Poison     80     100      123      122      120     80           1      False
4      4  Charmander                 Fire      NaN        39      52       43       60       50     65           1      False
def read():
    # Reload the CSV into the global df so that later exercises start from unmodified data
    global df
    df = pd.read_csv(f'{data_dir}/pandas/pokemon_data.csv')
read()
df.columns
Index(['#', 'Name', 'Type 1', 'Type 2', 'HP', 'Attack', 'Defense', 'Sp. Atk',
       'Sp. Def', 'Speed', 'Generation', 'Legendary'],
      dtype='object')
df.Name
0                  Bulbasaur
1                    Ivysaur
2                   Venusaur
3      VenusaurMega Venusaur
4                 Charmander
                 ...
795                  Diancie
796      DiancieMega Diancie
797      HoopaHoopa Confined
798       HoopaHoopa Unbound
799                Volcanion
Name: Name, Length: 800, dtype: object
df['Name']
0                  Bulbasaur
1                    Ivysaur
2                   Venusaur
3      VenusaurMega Venusaur
4                 Charmander
                 ...
795                  Diancie
796      DiancieMega Diancie
797      HoopaHoopa Confined
798       HoopaHoopa Unbound
799                Volcanion
Name: Name, Length: 800, dtype: object
df[['Name', 'HP', 'Speed']]
     Name                   HP  Speed
0    Bulbasaur              45     45
1    Ivysaur                60     60
2    Venusaur               80     80
3    VenusaurMega Venusaur  80     80
4    Charmander             39     65
..   ...                   ...    ...
795  Diancie                50     50
796  DiancieMega Diancie    50    110
797  HoopaHoopa Confined    80     70
798  HoopaHoopa Unbound     80     80
799  Volcanion              80     70

800 rows × 3 columns
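The three access forms above return different things: `df.Name` and `df['Name']` give a Series, while a list of names gives a DataFrame. A minimal sketch on a made-up three-column table (`toy` and its values are hypothetical, not the Pokemon data):

```python
import pandas as pd

# Hypothetical mini-table used only for illustration
toy = pd.DataFrame({"Name": ["A"], "Type 1": ["Grass"], "HP": [1]})

as_series = toy["Type 1"]      # a single label returns a Series
as_frame = toy[["Type 1"]]     # a list of labels returns a DataFrame
same = toy.Name.equals(toy["Name"])  # attribute access is equivalent here
```

Attribute access only works for names that are valid Python identifiers, so a column such as 'Type 1' has to be read with brackets.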

# The row at position n
df.iloc[1]
#                   2
Name          Ivysaur
Type 1          Grass
Type 2         Poison
HP                 60
Attack             62
Defense            63
Sp. Atk            80
Sp. Def            80
Speed              60
Generation          1
Legendary       False
Name: 1, dtype: object
# Rows at positions [n, m): the end position is excluded
df.iloc[0:4]
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Grass     Poison     45      49       49       65       65     45           1      False
1      2  Ivysaur                    Grass     Poison     60      62       63       80       80     60           1      False
2      3  Venusaur                   Grass     Poison     80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Grass     Poison     80     100      123      122      120     80           1      False
# The value at row position n, column position m
df.iloc[2, 2]
'Grass'
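`iloc` selects by position, while `loc` selects by index label. The two only coincide when the index is the default 0..n-1. A small sketch with a hypothetical table whose labels differ from the positions:

```python
import pandas as pd

# Hypothetical mini-table; the index labels deliberately start at 100
toy = pd.DataFrame(
    {"Name": ["A", "B", "C"], "HP": [10, 20, 30]},
    index=[100, 101, 102],
)

by_position = toy.iloc[1]    # second row, whatever its label is
by_label = toy.loc[101]      # the row whose index label is 101
first_two = toy.iloc[0:2]    # positions 0 and 1; the end is excluded
cell = toy.iloc[2, 1]        # row position 2, column position 1
```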
# Iterate over rows
for index, row in df.iterrows():
    # print(index, row)
    print(index, row['Name'])
0 Bulbasaur
1 Ivysaur
2 Venusaur
3 VenusaurMega Venusaur
4 Charmander
5 Charmeleon
...
796 DiancieMega Diancie
797 HoopaHoopa Confined
798 HoopaHoopa Unbound
799 Volcanion
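`iterrows()` builds a Series for every row. `itertuples()` is an alternative that yields lightweight namedtuples. A sketch on a hypothetical two-row table:

```python
import pandas as pd

# Hypothetical mini-table used only for illustration
toy = pd.DataFrame({"Name": ["A", "B"], "HP": [10, 20]})

# iterrows() yields (index, Series) pairs
pairs = [(index, row["Name"]) for index, row in toy.iterrows()]

# itertuples() yields namedtuples; fields are read as attributes
names = [t.Name for t in toy.itertuples(index=False)]
```

Column names that are not valid identifiers (for example 'Type 1') are renamed positionally by `itertuples()`, so bracket access through `iterrows()` is simpler for those.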
# Filter rows by a column value
df.loc[df['Type 1'] == "Grass"]
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Grass     Poison     45      49       49       65       65     45           1      False
1      2  Ivysaur                    Grass     Poison     60      62       63       80       80     60           1      False
2      3  Venusaur                   Grass     Poison     80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Grass     Poison     80     100      123      122      120     80           1      False
48    43  Oddish                     Grass     Poison     45      50       55       75       65     30           1      False
..   ...  ...                        ...       ...       ...     ...      ...      ...      ...    ...         ...        ...
718  650  Chespin                    Grass     NaN        56      61       65       48       45     38           6      False
719  651  Quilladin                  Grass     NaN        61      78       95       56       58     57           6      False
720  652  Chesnaught                 Grass     Fighting   88     107      122       74       75     64           6      False
740  672  Skiddo                     Grass     NaN        66      65       48       62       57     52           6      False
741  673  Gogoat                     Grass     NaN       123     100       62       97       81     68           6      False

70 rows × 12 columns
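The expression inside `df.loc[...]` is a boolean Series (a mask) with one True/False per row. A sketch on a hypothetical three-row table that shows the mask separately:

```python
import pandas as pd

# Hypothetical mini-table used only for illustration
toy = pd.DataFrame({"Name": ["A", "B", "C"],
                    "Type 1": ["Grass", "Fire", "Grass"]})

mask = toy["Type 1"] == "Grass"   # boolean Series aligned with toy's index
grass = toy.loc[mask]             # keeps only the rows where mask is True
```

The filtered rows keep their original index labels (0 and 2 here), which is why the output above jumps from 3 to 48.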

# Summary statistics; only numeric columns are included
# count: number of non-null values.
# mean: arithmetic mean.
# std: standard deviation.
# min: minimum.
# 25%: lower quartile.
# 50%: median (the 50th percentile; it is not the average of the lower and upper quartiles).
# 75%: upper quartile.
# max: maximum.
df.describe()
                #          HP      Attack     Defense     Sp. Atk     Sp. Def       Speed  Generation
count  800.000000  800.000000  800.000000  800.000000  800.000000  800.000000  800.000000   800.00000
mean   362.813750   69.258750   79.001250   73.842500   72.820000   71.902500   68.277500     3.32375
std    208.343798   25.534669   32.457366   31.183501   32.722294   27.828916   29.060474     1.66129
min      1.000000    1.000000    5.000000    5.000000   10.000000   20.000000    5.000000     1.00000
25%    184.750000   50.000000   55.000000   50.000000   49.750000   50.000000   45.000000     2.00000
50%    364.500000   65.000000   75.000000   70.000000   65.000000   70.000000   65.000000     3.00000
75%    539.250000   80.000000  100.000000   90.000000   95.000000   90.000000   90.000000     5.00000
max    721.000000  255.000000  190.000000  230.000000  194.000000  230.000000  180.000000     6.00000
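The 50% row is the median of the column itself. In the table above, for example, Sp. Atk has quartiles 49.75 and 95 but a median of 65. A sketch on a made-up Series (the values are arbitrary) that shows the same thing:

```python
import pandas as pd

# Hypothetical skewed sample
hp = pd.Series([1, 2, 3, 10, 100])

summary = hp.describe()
median = summary["50%"]                                   # same as hp.median()
quartile_avg = (summary["25%"] + summary["75%"]) / 2      # a different number
```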
# Sort
df.sort_values('Name')
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
510  460  Abomasnow                  Grass     Ice        90      92       75       92       85     60           4      False
511  460  AbomasnowMega Abomasnow    Grass     Ice        90     132      105      132      105     30           4      False
68    63  Abra                       Psychic   NaN        25      20       15      105       55     90           1      False
392  359  Absol                      Dark      NaN        65     130       60       75       60     75           3      False
393  359  AbsolMega Absol            Dark      NaN        65     150       60      115       60    115           3      False
..   ...  ...                        ...       ...       ...     ...      ...      ...      ...    ...         ...        ...
632  571  Zoroark                    Dark      NaN        60     105       60      120       60    105           5      False
631  570  Zorua                      Dark      NaN        40      65       40       80       40     65           5      False
46    41  Zubat                      Poison    Flying     40      45       35       30       40     55           1      False
695  634  Zweilous                   Dark      Dragon     72      85       70       65       70     58           5      False
794  718  Zygarde50% Forme           Dragon    Ground    108     100      121       81       95     95           6       True

800 rows × 12 columns

# Sort in descending order
df.sort_values('Name', ascending=False)
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
794  718  Zygarde50% Forme           Dragon    Ground    108     100      121       81       95     95           6       True
695  634  Zweilous                   Dark      Dragon     72      85       70       65       70     58           5      False
46    41  Zubat                      Poison    Flying     40      45       35       30       40     55           1      False
631  570  Zorua                      Dark      NaN        40      65       40       80       40     65           5      False
632  571  Zoroark                    Dark      NaN        60     105       60      120       60    105           5      False
..   ...  ...                        ...       ...       ...     ...      ...      ...      ...    ...         ...        ...
393  359  AbsolMega Absol            Dark      NaN        65     150       60      115       60    115           3      False
392  359  Absol                      Dark      NaN        65     130       60       75       60     75           3      False
68    63  Abra                       Psychic   NaN        25      20       15      105       55     90           1      False
511  460  AbomasnowMega Abomasnow    Grass     Ice        90     132      105      132      105     30           4      False
510  460  Abomasnow                  Grass     Ice        90      92       75       92       85     60           4      False

800 rows × 12 columns

# Sort by multiple columns
df.sort_values(['Type 1', 'HP'])
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
316  292  Shedinja                   Bug       Ghost       1      90       45       30       30     40           3      False
230  213  Shuckle                    Bug       Rock       20      10      230       10      230      5           2      False
462  415  Combee                     Bug       Flying     30      30       42       30       42     70           4      False
603  543  Venipede                   Bug       Poison     30      45       59       30       39     57           5      False
314  290  Nincada                    Bug       Ground     31      45       90       30       30     40           3      False
..   ...  ...                        ...       ...       ...     ...      ...      ...      ...    ...         ...        ...
142  131  Lapras                     Water     Ice       130      85       80       85       95     60           1      False
145  134  Vaporeon                   Water     NaN       130      65       60      110       95     65           1      False
350  320  Wailmer                    Water     NaN       130      70       35       70       35     60           3      False
655  594  Alomomola                  Water     NaN       165      75       80       40       45     65           5      False
351  321  Wailord                    Water     NaN       170      90       45       90       45     60           3      False

800 rows × 12 columns

# Sort by multiple columns, with a separate direction for each column
df.sort_values(['Type 1', 'HP'], ascending=[False, True])
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
139  129  Magikarp                   Water     NaN        20      10       55       15       20     80           1      False
381  349  Feebas                     Water     NaN        20      15       20       10       55     80           3      False
97    90  Shellder                   Water     NaN        30      65      100       45       25     40           1      False
106   98  Krabby                     Water     NaN        30     105       90       25       25     50           1      False
125  116  Horsea                     Water     NaN        30      40       70       70       25     60           1      False
..   ...  ...                        ...       ...       ...     ...      ...      ...      ...    ...         ...        ...
232  214  HeracrossMega Heracross    Bug       Fighting   80     185      115       40      105     75           2      False
678  617  Accelgor                   Bug       NaN        80      70       40      100       60    145           5      False
734  666  Vivillon                   Bug       Flying     80      52       50       90       50     89           6      False
698  637  Volcarona                  Bug       Fire       85      60       65      135      105    100           5      False
520  469  Yanmega                    Bug       Flying     86      76       86      116       56     95           4      False

800 rows × 12 columns
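With a list for `ascending`, each entry applies to the column at the same position in the sort key list. A sketch on a hypothetical four-row table, sorting Type 1 descending and HP ascending within each type:

```python
import pandas as pd

# Hypothetical mini-table used only for illustration
toy = pd.DataFrame({"Type 1": ["Bug", "Water", "Bug", "Water"],
                    "HP": [30, 20, 10, 50]})

# Type 1 from Z to A, then HP from low to high inside each type
ordered = toy.sort_values(["Type 1", "HP"], ascending=[False, True])
```

`sort_values` returns a new DataFrame and leaves `toy` unchanged; the original index labels travel with the rows.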

df['Total'] = df['HP'] + df['Attack'] + df['Defense'] + df['Sp. Atk'] + df['Sp. Def'] + df['Speed']
df
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary  Total
0      1  Bulbasaur                  Grass     Poison     45      49       49       65       65     45           1      False    318
1      2  Ivysaur                    Grass     Poison     60      62       63       80       80     60           1      False    405
2      3  Venusaur                   Grass     Poison     80      82       83      100      100     80           1      False    525
3      3  VenusaurMega Venusaur      Grass     Poison     80     100      123      122      120     80           1      False    625
4      4  Charmander                 Fire      NaN        39      52       43       60       50     65           1      False    309
..   ...  ...                        ...       ...       ...     ...      ...      ...      ...    ...         ...        ...    ...
795  719  Diancie                    Rock      Fairy      50     100      150      100      150     50           6       True    600
796  719  DiancieMega Diancie        Rock      Fairy      50     160      110      160      110    110           6       True    700
797  720  HoopaHoopa Confined        Psychic   Ghost      80     110       60      150      130     70           6       True    600
798  720  HoopaHoopa Unbound         Psychic   Dark       80     160       60      170      130     80           6       True    680
799  721  Volcanion                  Fire      Water      80     110      120      130       90     70           6       True    600

800 rows × 13 columns

df['Total'] = df['HP'] + df['Attack'] + df['Defense'] + df['Sp. Atk'] + df['Sp. Def'] + df['Speed']
# df.drop does not modify df in place, so the result must be assigned back.
# Dropping a column that does not exist raises an error.
df = df.drop(columns=['Total'])
df.head()
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Grass     Poison     45      49       49       65       65     45           1      False
1      2  Ivysaur                    Grass     Poison     60      62       63       80       80     60           1      False
2      3  Venusaur                   Grass     Poison     80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Grass     Poison     80     100      123      122      120     80           1      False
4      4  Charmander                 Fire      NaN        39      52       43       60       50     65           1      False
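Both points about `drop` can be checked on a hypothetical one-row table: the original is untouched, and a missing column raises `KeyError` unless `errors="ignore"` is passed.

```python
import pandas as pd

# Hypothetical mini-table used only for illustration
toy = pd.DataFrame({"HP": [1], "Total": [1]})

without_total = toy.drop(columns=["Total"])   # returns a new DataFrame
still_has = "Total" in toy.columns            # toy itself is unchanged

# Dropping the column a second time fails because it no longer exists
try:
    without_total.drop(columns=["Total"])
    raised = False
except KeyError:
    raised = True

# errors="ignore" skips labels that are not present
safe = without_total.drop(columns=["Total"], errors="ignore")
```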
# With axis=0 the sum runs down each column, giving one value per column.
# With axis=1 the sum runs across each row, giving one value per row.
df['Total'] = df.iloc[:, 4:10].sum(axis=1)
df.head()
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary  Total
0      1  Bulbasaur                  Grass     Poison     45      49       49       65       65     45           1      False    318
1      2  Ivysaur                    Grass     Poison     60      62       63       80       80     60           1      False    405
2      3  Venusaur                   Grass     Poison     80      82       83      100      100     80           1      False    525
3      3  VenusaurMega Venusaur      Grass     Poison     80     100      123      122      120     80           1      False    625
4      4  Charmander                 Fire      NaN        39      52       43       60       50     65           1      False    309
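The slice `4:10` depends on the column order. Selecting the stat columns by name is one alternative that survives reordering. A sketch on a hypothetical table (`stat_cols` is an illustrative name):

```python
import pandas as pd

# Hypothetical mini-table used only for illustration
toy = pd.DataFrame({"Name": ["A", "B"], "HP": [1, 2], "Attack": [10, 20]})

stat_cols = ["HP", "Attack"]
per_row = toy[stat_cols].sum(axis=1)   # one total per row
per_col = toy[stat_cols].sum(axis=0)   # one total per column
```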
read()
df['Total'] = df.iloc[:, 4:10].sum(axis=1)
# Select specific columns
df = df[['Total', 'HP', 'Defense']]
df.head(5)
   Total  HP  Defense
0    318  45       49
1    405  60       63
2    525  80       83
3    625  80      123
4    309  39       43
read()
df['Total'] = df.iloc[:, 4:10].sum(axis=1)
cols = list(df.columns)
print(cols)
# Select columns in a chosen order: the last column (Total) moves to just after Type 2
df = df[cols[0:4] + [cols[-1]] + cols[4:-1]]
df.head(5)
['#', 'Name', 'Type 1', 'Type 2', 'HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Generation', 'Legendary', 'Total']
       #  Name                       Type 1    Type 2    Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Grass     Poison      318   45      49       49       65       65     45           1      False
1      2  Ivysaur                    Grass     Poison      405   60      62       63       80       80     60           1      False
2      3  Venusaur                   Grass     Poison      525   80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Grass     Poison      625   80     100      123      122      120     80           1      False
4      4  Charmander                 Fire      NaN         309   39      52       43       60       50     65           1      False
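The slicing above hard-codes the positions 4 and -1. A variation that moves a column by name, sketched on a hypothetical table:

```python
import pandas as pd

# Hypothetical mini-table used only for illustration
toy = pd.DataFrame({"Name": ["A"], "HP": [1], "Attack": [2], "Total": [3]})

cols = list(toy.columns)
# Take 'Total' out of the list and put it back at position 1
cols.insert(1, cols.pop(cols.index("Total")))
toy = toy[cols]
```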
# Write to a file
# df.to_csv('/tmp/modified.csv', index=False)
# df.to_excel('/tmp/modified.xlsx', index=False)
# Filter on multiple columns (AND)
df.loc[(df['Type 1'] == 'Grass') & (df['Type 2'] == 'Poison')]
       #  Name                       Type 1    Type 2    Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Grass     Poison      318   45      49       49       65       65     45           1      False
1      2  Ivysaur                    Grass     Poison      405   60      62       63       80       80     60           1      False
2      3  Venusaur                   Grass     Poison      525   80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Grass     Poison      625   80     100      123      122      120     80           1      False
48    43  Oddish                     Grass     Poison      320   45      50       55       75       65     30           1      False
49    44  Gloom                      Grass     Poison      395   60      65       70       85       75     40           1      False
50    45  Vileplume                  Grass     Poison      490   75      80       85      110       90     50           1      False
75    69  Bellsprout                 Grass     Poison      300   50      75       35       70       30     40           1      False
76    70  Weepinbell                 Grass     Poison      390   65      90       50       85       45     55           1      False
77    71  Victreebel                 Grass     Poison      490   80     105       65      100       70     70           1      False
344  315  Roselia                    Grass     Poison      400   50      60       45      100       80     65           3      False
451  406  Budew                      Grass     Poison      280   40      30       35       50       70     55           4      False
452  407  Roserade                   Grass     Poison      515   60      70       65      125      105     90           4      False
651  590  Foongus                    Grass     Poison      294   69      55       45       55       55     15           5      False
652  591  Amoonguss                  Grass     Poison      464  114      85       70       85       80     30           5      False
# Filter on multiple columns (OR)
df.loc[(df['Type 1'] == 'Grass') | (df['Type 2'] == 'Poison')]
       #  Name                       Type 1    Type 2    Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Grass     Poison      318   45      49       49       65       65     45           1      False
1      2  Ivysaur                    Grass     Poison      405   60      62       63       80       80     60           1      False
2      3  Venusaur                   Grass     Poison      525   80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Grass     Poison      625   80     100      123      122      120     80           1      False
16    13  Weedle                     Bug       Poison      195   40      35       30       20       20     50           1      False
..   ...  ...                        ...       ...         ...  ...     ...      ...      ...      ...    ...         ...        ...
718  650  Chespin                    Grass     NaN         313   56      61       65       48       45     38           6      False
719  651  Quilladin                  Grass     NaN         405   61      78       95       56       58     57           6      False
720  652  Chesnaught                 Grass     Fighting    530   88     107      122       74       75     64           6      False
740  672  Skiddo                     Grass     NaN         350   66      65       48       62       57     52           6      False
741  673  Gogoat                     Grass     NaN         531  123     100       62       97       81     68           6      False

89 rows × 13 columns
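Conditions are combined with `&` (and), `|` (or) and `~` (not), and each comparison needs its own parentheses because these operators bind more tightly than `==`. A sketch on a hypothetical table that also contains a missing Type 2:

```python
import pandas as pd

# Hypothetical mini-table; None stands in for a missing Type 2
toy = pd.DataFrame({"Type 1": ["Grass", "Grass", "Bug"],
                    "Type 2": ["Poison", None, "Poison"]})

is_grass = toy["Type 1"] == "Grass"
is_poison = toy["Type 2"] == "Poison"   # a missing value compares as False

both = toy.loc[is_grass & is_poison]
either = toy.loc[is_grass | is_poison]
not_grass = toy.loc[~is_grass]
```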

new_df = df.loc[(df['Type 1'] == 'Grass') & (df['Type 2'] == 'Poison') & (df['HP'] > 70)]
# new_df.to_csv('/tmp/filtered.csv', index=False)
# reset_index() generates a new index for new_df; the old index from df is kept as the "index" column
new_df = new_df.reset_index()
new_df
   index    #  Name                       Type 1    Type 2    Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      2    3  Venusaur                   Grass     Poison      525   80      82       83      100      100     80           1      False
1      3    3  VenusaurMega Venusaur      Grass     Poison      625   80     100      123      122      120     80           1      False
2     50   45  Vileplume                  Grass     Poison      490   75      80       85      110       90     50           1      False
3     77   71  Victreebel                 Grass     Poison      490   80     105       65      100       70     70           1      False
4    652  591  Amoonguss                  Grass     Poison      464  114      85       70       85       80     30           5      False
new_df = df.loc[(df['Type 1'] == 'Grass') & (df['Type 2'] == 'Poison') & (df['HP'] > 70)]
# reset_index(drop=True) generates a new index for new_df and discards the old index instead of keeping it as a column
new_df = new_df.reset_index(drop=True)
new_df
       #  Name                       Type 1    Type 2    Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      3  Venusaur                   Grass     Poison      525   80      82       83      100      100     80           1      False
1      3  VenusaurMega Venusaur      Grass     Poison      625   80     100      123      122      120     80           1      False
2     45  Vileplume                  Grass     Poison      490   75      80       85      110       90     50           1      False
3     71  Victreebel                 Grass     Poison      490   80     105       65      100       70     70           1      False
4    591  Amoonguss                  Grass     Poison      464  114      85       70       85       80     30           5      False
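The difference between the two `reset_index` calls, sketched on a hypothetical three-row table:

```python
import pandas as pd

# Hypothetical mini-table used only for illustration
toy = pd.DataFrame({"HP": [10, 80, 90]})
strong = toy.loc[toy["HP"] > 70]          # keeps index labels 1 and 2

kept = strong.reset_index()               # old labels move into an "index" column
dropped = strong.reset_index(drop=True)   # old labels are discarded
```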
df.loc[df['Name'].str.contains('Mega')]
       #  Name                       Type 1    Type 2    Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
3      3  VenusaurMega Venusaur      Grass     Poison      625   80     100      123      122      120     80           1      False
7      6  CharizardMega Charizard X  Fire      Dragon      634   78     130      111      130       85    100           1      False
8      6  CharizardMega Charizard Y  Fire      Flying      634   78     104       78      159      115    100           1      False
12     9  BlastoiseMega Blastoise    Water     NaN         630   79     103      120      135      115     78           1      False
19    15  BeedrillMega Beedrill      Bug       Poison      495   65     150       40       15       80    145           1      False
23    18  PidgeotMega Pidgeot        Normal    Flying      579   83      80       80      135       80    121           1      False
71    65  AlakazamMega Alakazam      Psychic   NaN         590   55      50       65      175       95    150           1      False
87    80  SlowbroMega Slowbro        Water     Psychic     590   95      75      180      130       80     30           1      False
102   94  GengarMega Gengar          Ghost     Poison      600   60      65       80      170       95    130           1      False
124  115  KangaskhanMega Kangaskhan  Normal    NaN         590  105     125      100       60      100    100           1      False
137  127  PinsirMega Pinsir          Bug       Flying      600   65     155      120       65       90    105           1      False
141  130  GyaradosMega Gyarados      Water     Dark        640   95     155      109       70      130     81           1      False
154  142  AerodactylMega Aerodactyl  Rock      Flying      615   80     135       85       70       95    150           1      False
163  150  MewtwoMega Mewtwo X        Psychic   Fighting    780  106     190      100      154      100    130           1       True
164  150  MewtwoMega Mewtwo Y        Psychic   NaN         780  106     150       70      194      120    140           1       True
168  154  Meganium                   Grass     NaN         525   80      82      100       83      100     80           2      False
196  181  AmpharosMega Ampharos      Electric  Dragon      610   90      95      105      165      110     45           2      False
224  208  SteelixMega Steelix        Steel     Ground      610   75     125      230       55       95     30           2      False
229  212  ScizorMega Scizor          Bug       Steel       600   70     150      140       65      100     75           2      False
232  214  HeracrossMega Heracross    Bug       Fighting    600   80     185      115       40      105     75           2      False
248  229  HoundoomMega Houndoom      Dark      Fire        600   75      90       90      140       90    115           2      False
268  248  TyranitarMega Tyranitar    Rock      Dark        700  100     164      150       95      120     71           2      False
275  254  SceptileMega Sceptile      Grass     Dragon      630   70     110       75      145       85    145           3      False
279  257  BlazikenMega Blaziken      Fire      Fighting    630   80     160       80      130       80    100           3      False
283  260  SwampertMega Swampert      Water     Ground      635  100     150      110       95      110     70           3      False
306  282  GardevoirMega Gardevoir    Psychic   Fairy       618   68      85       65      165      135    100           3      False
327  302  SableyeMega Sableye        Dark      Ghost       480   50      85      125       85      115     20           3      False
329  303  MawileMega Mawile          Steel     Fairy       480   50     105      125       55       95     50           3      False
333  306  AggronMega Aggron          Steel     NaN         630   70     140      230       60       80     50           3      False
336  308  MedichamMega Medicham      Fighting  Psychic     510   60     100       85       80       85    100           3      False
339  310  ManectricMega Manectric    Electric  NaN         575   70      75       80      135       80    135           3      False
349  319  SharpedoMega Sharpedo      Water     Dark        560   70     140       70      110       65    105           3      False
354  323  CameruptMega Camerupt      Fire      Ground      560   70     120      100      145      105     20           3      False
366  334  AltariaMega Altaria        Dragon    Fairy       590   75     110      110      110      105     80           3      False
387  354  BanetteMega Banette        Ghost     NaN         555   64     165       75       93       83     75           3      False
393  359  AbsolMega Absol            Dark      NaN         565   65     150       60      115       60    115           3      False
397  362  GlalieMega Glalie          Ice       NaN         580   80     120       80      120       80    100           3      False
409  373  SalamenceMega Salamence    Dragon    Flying      700   95     145      130      120       90    120           3      False
413  376  MetagrossMega Metagross    Steel     Psychic     700   80     145      150      105      110    110           3      False
418  380  LatiasMega Latias          Dragon    Psychic     700   80     100      120      140      150    110           3       True
420  381  LatiosMega Latios          Dragon    Psychic     700   80     130      100      160      120    110           3       True
426  384  RayquazaMega Rayquaza      Dragon    Flying      780  105     180      100      180      100    115           3       True
476  428  LopunnyMega Lopunny        Normal    Fighting    580   65     136       94       54       96    135           4      False
494  445  GarchompMega Garchomp      Dragon    Ground      700  108     170      115      120       95     92           4      False
498  448  LucarioMega Lucario        Fighting  Steel       625   70     145       88      140       70    112           4      False
511  460  AbomasnowMega Abomasnow    Grass     Ice         594   90     132      105      132      105     30           4      False
527  475  GalladeMega Gallade        Psychic   Fighting    618   68     165       95       65      115    110           4      False
591  531  AudinoMega Audino          Normal    Fairy       545  103      60      126       80      126     50           5      False
796  719  DiancieMega Diancie        Rock      Fairy       700   50     160      110      160      110    110           6       True
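Row 168 (Meganium) appears in the result because `str.contains('Mega')` matches any name containing that substring. If only Mega forms are wanted, one option is to require the space that follows "Mega" in this dataset's naming. That is an assumption about the data, sketched here on a few names taken from the output above:

```python
import pandas as pd

# Names copied from the output above
names = pd.Series(["Venusaur", "VenusaurMega Venusaur", "Meganium"])

has_mega = names.str.contains("Mega")            # substring match, includes Meganium
mega_form = names.str.contains("Mega ", regex=False)  # literal "Mega" plus a space
```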
import re
df.loc[df['Type 1'].str.contains('Fire|grass', regex=True, flags=re.I)]
       #  Name                       Type 1    Type 2    Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Grass     Poison      318   45      49       49       65       65     45           1      False
1      2  Ivysaur                    Grass     Poison      405   60      62       63       80       80     60           1      False
2      3  Venusaur                   Grass     Poison      525   80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Grass     Poison      625   80     100      123      122      120     80           1      False
4      4  Charmander                 Fire      NaN         309   39      52       43       60       50     65           1      False
..   ...  ...                        ...       ...         ...  ...     ...      ...      ...      ...    ...         ...        ...
735  667  Litleo                     Fire      Normal      369   62      50       58       73       54     72           6      False
736  668  Pyroar                     Fire      Normal      507   86      68       72      109       66    106           6      False
740  672  Skiddo                     Grass     NaN         350   66      65       48       62       57     52           6      False
741  673  Gogoat                     Grass     NaN         531  123     100       62       97       81     68           6      False
799  721  Volcanion                  Fire      Water       600   80     110      120      130       90     70           6       True

122 rows × 13 columns

import re
df.loc[df['Name'].str.contains('^pi[a-z]', regex=True, flags=re.I)]
       #  Name                       Type 1    Type 2    Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
20    16  Pidgey                     Normal    Flying      251   40      45       40       35       35     56           1      False
21    17  Pidgeotto                  Normal    Flying      349   63      60       55       50       50     71           1      False
22    18  Pidgeot                    Normal    Flying      479   83      80       75       70       70    101           1      False
23    18  PidgeotMega Pidgeot        Normal    Flying      579   83      80       80      135       80    121           1      False
30    25  Pikachu                    Electric  NaN         320   35      55       40       50       50     90           1      False
136  127  Pinsir                     Bug       NaN         500   65     125      100       55       70     85           1      False
137  127  PinsirMega Pinsir          Bug       Flying      600   65     155      120       65       90    105           1      False
186  172  Pichu                      Electric  NaN         205   20      40       15       35       35     60           2      False
219  204  Pineco                     Bug       NaN         290   50      65       90       35       35     15           2      False
239  221  Piloswine                  Ice       Ground      450  100     100       80       60       60     50           2      False
438  393  Piplup                     Water     NaN         314   53      51       53       61       56     40           4      False
558  499  Pignite                    Fire      Fighting    418   90      93       55       70       55     55           5      False
578  519  Pidove                     Normal    Flying      264   50      55       50       36       30     43           5      False
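`str.contains` also has a `case` parameter, which can replace `flags=re.I`, and an `na` parameter that decides what missing values become in the mask. The latter matters for a column like Type 2 that contains NaN. A sketch on a hypothetical Series:

```python
import pandas as pd

# Hypothetical names; None stands in for a missing value
names = pd.Series(["Pikachu", "pidgey", "Abra", None])

# case=False ignores case; na=False turns missing values into False in the mask
starts_pi = names.str.contains(r"^pi[a-z]", case=False, na=False)
matches = names[starts_pi]
```

Without `na=False`, the mask would hold a missing value for the last entry, and a mask with missing values cannot be used directly for filtering.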
read()
# For the rows matching the condition (first argument), set the column named by the second argument to the assigned value
df.loc[df['Type 1'] == 'Grass', 'Type 1'] = 'Flamer'
df
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Flamer    Poison     45      49       49       65       65     45           1      False
1      2  Ivysaur                    Flamer    Poison     60      62       63       80       80     60           1      False
2      3  Venusaur                   Flamer    Poison     80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Flamer    Poison     80     100      123      122      120     80           1      False
4      4  Charmander                 Fire      NaN        39      52       43       60       50     65           1      False
..   ...  ...                        ...       ...       ...     ...      ...      ...      ...    ...         ...        ...
795  719  Diancie                    Rock      Fairy      50     100      150      100      150     50           6       True
796  719  DiancieMega Diancie        Rock      Fairy      50     160      110      160      110    110           6       True
797  720  HoopaHoopa Confined        Psychic   Ghost      80     110       60      150      130     70           6       True
798  720  HoopaHoopa Unbound         Psychic   Dark       80     160       60      170      130     80           6       True
799  721  Volcanion                  Fire      Water      80     110      120      130       90     70           6       True

800 rows × 12 columns

read()
# For the rows matching the condition, set every listed column to the same value.
# 'Lendary' is not an existing column (probably a typo for 'Legendary'), so a new column is created; rows that do not match get NaN.
df.loc[df['Type 1'] == 'Grass', ['Type 1', 'Lendary']] = 'TEST VALUE'
df
       #  Name                       Type 1      Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary     Lendary
0      1  Bulbasaur                  TEST VALUE  Poison     45      49       49       65       65     45           1      False  TEST VALUE
1      2  Ivysaur                    TEST VALUE  Poison     60      62       63       80       80     60           1      False  TEST VALUE
2      3  Venusaur                   TEST VALUE  Poison     80      82       83      100      100     80           1      False  TEST VALUE
3      3  VenusaurMega Venusaur      TEST VALUE  Poison     80     100      123      122      120     80           1      False  TEST VALUE
4      4  Charmander                 Fire        NaN        39      52       43       60       50     65           1      False         NaN
..   ...  ...                        ...         ...       ...     ...      ...      ...      ...    ...         ...        ...         ...
795  719  Diancie                    Rock        Fairy      50     100      150      100      150     50           6       True         NaN
796  719  DiancieMega Diancie        Rock        Fairy      50     160      110      160      110    110           6       True         NaN
797  720  HoopaHoopa Confined        Psychic     Ghost      80     110       60      150      130     70           6       True         NaN
798  720  HoopaHoopa Unbound         Psychic     Dark       80     160       60      170      130     80           6       True         NaN
799  721  Volcanion                  Fire        Water      80     110      120      130       90     70           6       True         NaN

800 rows × 13 columns

read()
# For the rows matching the condition, set each listed column to the value at the same position in the list on the right
df.loc[df['Type 1'] == 'Grass', ['Type 1', 'Lendary']] = ['TEST 1', 'TEST 2']
df
       #  Name                       Type 1      Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary     Lendary
0      1  Bulbasaur                  TEST 1      Poison     45      49       49       65       65     45           1      False      TEST 2
1      2  Ivysaur                    TEST 1      Poison     60      62       63       80       80     60           1      False      TEST 2
2      3  Venusaur                   TEST 1      Poison     80      82       83      100      100     80           1      False      TEST 2
3      3  VenusaurMega Venusaur      TEST 1      Poison     80     100      123      122      120     80           1      False      TEST 2
4      4  Charmander                 Fire        NaN        39      52       43       60       50     65           1      False         NaN
..   ...  ...                        ...         ...       ...     ...      ...      ...      ...    ...         ...        ...         ...
795  719  Diancie                    Rock        Fairy      50     100      150      100      150     50           6       True         NaN
796  719  DiancieMega Diancie        Rock        Fairy      50     160      110      160      110    110           6       True         NaN
797  720  HoopaHoopa Confined        Psychic     Ghost      80     110       60      150      130     70           6       True         NaN
798  720  HoopaHoopa Unbound         Psychic     Dark       80     160       60      170      130     80           6       True         NaN
799  721  Volcanion                  Fire        Water      80     110      120      130       90     70           6       True         NaN

800 rows × 13 columns
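The same conditional assignment on a hypothetical three-row table, with one value per target column. Both target columns already exist here, so no new column is created:

```python
import pandas as pd

# Hypothetical mini-table used only for illustration
toy = pd.DataFrame({"Name": ["A", "B", "C"],
                    "Type 1": ["Grass", "Fire", "Grass"],
                    "Type 2": ["Poison", "Flying", "Ice"]})

# Rows where Type 1 is Grass get 'TEST 1' in Type 1 and 'TEST 2' in Type 2
toy.loc[toy["Type 1"] == "Grass", ["Type 1", "Type 2"]] = ["TEST 1", "TEST 2"]
```

The assignment modifies `toy` in place, which is why the notes call `read()` before each variant to restore the original data.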

read()
# Group and take the mean
df.groupby(['Type 1']).mean(numeric_only=True).sort_values('Defense', ascending=False)
                   #          HP      Attack     Defense     Sp. Atk     Sp. Def       Speed  Generation   Legendary
Type 1
Steel     442.851852   65.222222   92.703704  126.370370   67.518519   80.629630   55.259259    3.851852    0.148148
Rock      392.727273   65.363636   92.863636  100.795455   63.340909   75.477273   55.909091    3.454545    0.090909
Dragon    474.375000   83.312500  112.125000   86.375000   96.843750   88.843750   83.031250    3.875000    0.375000
Ground    356.281250   73.781250   95.750000   84.843750   56.468750   62.750000   63.906250    3.156250    0.125000
Ghost     486.500000   64.437500   73.781250   81.187500   79.343750   76.468750   64.343750    4.187500    0.062500
Water     303.089286   72.062500   74.151786   72.946429   74.812500   70.517857   65.964286    2.857143    0.035714
Ice       423.541667   72.000000   72.750000   71.416667   77.541667   76.291667   63.458333    3.541667    0.083333
Grass     344.871429   67.271429   73.214286   70.800000   77.500000   70.428571   61.928571    3.357143    0.042857
Bug       334.492754   56.884058   70.971014   70.724638   53.869565   64.797101   61.681159    3.217391    0.000000
Dark      461.354839   66.806452   88.387097   70.225806   74.645161   69.516129   76.161290    4.032258    0.064516
Poison    251.785714   67.250000   74.678571   68.821429   60.428571   64.392857   63.571429    2.535714    0.000000
Fire      327.403846   69.903846   84.769231   67.769231   88.980769   72.211538   74.442308    3.211538    0.096154
Psychic   380.807018   70.631579   71.456140   67.684211   98.403509   86.280702   81.491228    3.385965    0.245614
Electric  363.500000   59.795455   69.090909   66.295455   90.022727   73.704545   84.500000    3.272727    0.090909
Flying    677.750000   70.750000   78.750000   66.250000   94.250000   72.500000  102.500000    5.500000    0.500000
Fighting  363.851852   69.851852   96.777778   65.925926   53.111111   64.703704   66.074074    3.370370    0.000000
Fairy     449.529412   74.117647   61.529412   65.705882   78.529412   84.705882   48.588235    4.117647    0.058824
Normal    319.173469   77.275510   73.469388   59.846939   55.816327   63.724490   71.551020    3.051020    0.020408
read()
# Group and sum
df.groupby(['Type 1']).sum(numeric_only=True).sort_values('HP', ascending=False)
              #    HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
Type 1
Water     33946  8071    8305     8170     8379     7898   7388         320          4
Normal    31279  7573    7200     5865     5470     6245   7012         299          2
Grass     24141  4709    5125     4956     5425     4930   4335         235          3
Psychic   21706  4026    4073     3858     5609     4918   4645         193         14
Bug       23080  3925    4897     4880     3717     4471   4256         222          0
Fire      17025  3635    4408     3524     4627     3755   3871         167          5
Rock      17280  2876    4086     4435     2787     3321   2460         152          4
Dragon    15180  2666    3588     2764     3099     2843   2657         124         12
Electric  15994  2631    3040     2917     3961     3243   3718         144          4
Ground    11401  2361    3064     2715     1807     2008   2045         101          4
Dark      14302  2071    2740     2177     2314     2155   2361         125          2
Ghost     15568  2062    2361     2598     2539     2447   2059         134          2
Fighting   9824  1886    2613     1780     1434     1747   1784          91          0
Poison     7050  1883    2091     1927     1692     1803   1780          71          0
Steel     11957  1761    2503     3412     1823     2177   1492         104          4
Ice       10165  1728    1746     1714     1861     1831   1523          85          2
Fairy      7642  1260    1046     1117     1335     1440    826          70          1
Flying     2711   283     315      265      377      290    410          22          2
read()
# Group and count (count() only counts non-null values, so Type 2 is lower than the other columns)
df.groupby(['Type 1']).count().sort_values('HP', ascending=False)
            #  Name  Type 2   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
Type 1
Water     112   112      53  112     112      112      112      112    112         112        112
Normal     98    98      37   98      98       98       98       98     98          98         98
Grass      70    70      37   70      70       70       70       70     70          70         70
Bug        69    69      52   69      69       69       69       69     69          69         69
Psychic    57    57      19   57      57       57       57       57     57          57         57
Fire       52    52      24   52      52       52       52       52     52          52         52
Electric   44    44      17   44      44       44       44       44     44          44         44
Rock       44    44      35   44      44       44       44       44     44          44         44
Ghost      32    32      22   32      32       32       32       32     32          32         32
Ground     32    32      19   32      32       32       32       32     32          32         32
Dragon     32    32      21   32      32       32       32       32     32          32         32
Dark       31    31      21   31      31       31       31       31     31          31         31
Poison     28    28      13   28      28       28       28       28     28          28         28
Fighting   27    27       7   27      27       27       27       27     27          27         27
Steel      27    27      22   27      27       27       27       27     27          27         27
Ice        24    24      11   24      24       24       24       24     24          24         24
Fairy      17    17       2   17      17       17       17       17     17          17         17
Flying      4     4       2    4       4        4        4        4      4           4          4
read()
# Set count = 1 on every row
df['count'] = 1
# Group and count rows
df.groupby(['Type 1']).count()['count']
Type 1
Bug          69
Dark         31
Dragon       32
Electric     44
Fairy        17
Fighting     27
Fire         52
Flying        4
Ghost        32
Grass        70
Ground       32
Ice          24
Normal       98
Poison       28
Psychic      57
Rock         44
Steel        27
Water       112
Name: count, dtype: int64
read()
# Set count = 1 on every row
df['count'] = 1
# Group by two columns and count rows
df.groupby(['Type 1', 'Type 2']).count()['count']
Type 1  Type 2  
Bug     Electric     2
        Fighting     2
        Fire         2
        Flying      14
        Ghost        1
                    ..
Water   Ice          3
        Poison       3
        Psychic      5
        Rock         4
        Steel        1
Name: count, Length: 136, dtype: int64
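The helper column of ones is one way to count rows per group. `groupby(...).size()` gives the row count directly, while `count()` on a column only counts its non-null values. A sketch on a hypothetical table with one missing Type 2:

```python
import pandas as pd

# Hypothetical mini-table; None stands in for a missing Type 2
toy = pd.DataFrame({"Type 1": ["Bug", "Bug", "Water"],
                    "Type 2": ["Flying", None, "Ice"]})

sizes = toy.groupby("Type 1").size()                    # rows per group
type2_counts = toy.groupby("Type 1")["Type 2"].count()  # non-null Type 2 per group
by_value = toy["Type 1"].value_counts()                 # same totals as sizes
```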
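# Read a large file
# Read the data in batches; chunksize is the number of rows per batch
for df in pd.read_csv(f'{data_dir}/pandas/pokemon_data.csv', chunksize=5):
    # print('CHUNK DF:::')
    # print(df['Name'])
    pass  # keeps the loop body valid while the prints stay commented out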
# Read a large file
# Read the data in batches; chunksize is the number of rows per batch
new_df = pd.DataFrame()
for df in pd.read_csv(f'{data_dir}/pandas/pokemon_data.csv', chunksize=5):
    new_df = pd.concat([new_df, df])
    # results = df.groupby(['Type 1']).count()
    # new_df = pd.concat([new_df, results])
new_df
       #  Name                       Type 1    Type 2     HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0      1  Bulbasaur                  Grass     Poison     45      49       49       65       65     45           1      False
1      2  Ivysaur                    Grass     Poison     60      62       63       80       80     60           1      False
2      3  Venusaur                   Grass     Poison     80      82       83      100      100     80           1      False
3      3  VenusaurMega Venusaur      Grass     Poison     80     100      123      122      120     80           1      False
4      4  Charmander                 Fire      NaN        39      52       43       60       50     65           1      False
..   ...  ...                        ...       ...       ...     ...      ...      ...      ...    ...         ...        ...
795  719  Diancie                    Rock      Fairy      50     100      150      100      150     50           6       True
796  719  DiancieMega Diancie        Rock      Fairy      50     160      110      160      110    110           6       True
797  720  HoopaHoopa Confined        Psychic   Ghost      80     110       60      150      130     70           6       True
798  720  HoopaHoopa Unbound         Psychic   Dark       80     160       60      170      130     80           6       True
799  721  Volcanion                  Fire      Water      80     110      120      130       90     70           6       True

800 rows × 12 columns
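Concatenating every chunk rebuilds the whole table in memory. For a file that really is too large, one approach is to aggregate each chunk and then combine the partial results. A sketch that reads a small in-memory CSV (the text and the `partials` and `totals` names are made up for illustration):

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a large file
csv_text = "Name,Type 1\nA,Grass\nB,Fire\nC,Grass\nD,Water\nE,Grass\n"

partials = []
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    # Per-chunk counts; the same type can appear in several chunks
    partials.append(chunk["Type 1"].value_counts())

# Add up the per-chunk counts that share the same type label
totals = pd.concat(partials).groupby(level=0).sum()
```

Only the small per-chunk summaries are kept, so memory use depends on the number of distinct types, not on the number of rows.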
