Simple Data Processing with NumPy and pandas

from pandas import DataFrame, Series
import numpy

countries = ['Russian Fed.', 'Norway', 'Canada', 'United States',
             'Netherlands', 'Germany', 'Switzerland', 'Belarus',
             'Austria', 'France', 'Poland', 'China', 'Korea',
             'Sweden', 'Czech Republic', 'Slovenia', 'Japan',
             'Finland', 'Great Britain', 'Ukraine', 'Slovakia',
             'Italy', 'Latvia', 'Australia', 'Croatia', 'Kazakhstan']

gold = [13, 11, 10, 9, 8, 8, 6, 5, 4, 4, 4, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
silver = [11, 5, 10, 7, 7, 6, 3, 0, 8, 4, 1, 4, 3, 7, 4, 2, 4, 3, 1, 0, 0, 2, 2, 2, 1, 0]
bronze = [9, 10, 5, 12, 9, 5, 2, 1, 5, 7, 1, 2, 2, 6, 2, 4, 3, 1, 2, 1, 0, 6, 2, 1, 0, 1]

Display the medal data:

# Build one row per country, with one column per medal type
olympic_medal_counts_df = DataFrame(
    {'country_name': countries,
     'gold': gold,
     'silver': silver,
     'bronze': bronze})
print(olympic_medal_counts_df)

# Short alias used in the steps that follow
df = olympic_medal_counts_df

Compute the average number of bronze medals among the countries that won at least one gold medal:

# df.gold > 0 is a boolean mask; indexing with it keeps only the matching rows
avg_bronze_at_least_one_gold = numpy.mean(df[df.gold > 0].bronze)
print(avg_bronze_at_least_one_gold)

output>>4.2380952381
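
The one-liner above packs three steps together: build a boolean mask, filter rows with it, then average a column. A minimal sketch of those steps, using four rows copied from the lists above (the choice of rows and the names `sample`, `has_gold`, and `bronze_of_winners` are only for illustration):

```python
import pandas as pd

# Four rows taken from the medal lists above; the subset is illustrative.
sample = pd.DataFrame({
    'country_name': ['Russian Fed.', 'Norway', 'Italy', 'Latvia'],
    'gold': [13, 11, 0, 0],
    'bronze': [9, 10, 6, 2],
})

# Step 1: the comparison yields one True/False value per row.
has_gold = sample['gold'] > 0

# Step 2: selecting with the mask keeps only the rows marked True.
bronze_of_winners = sample.loc[has_gold, 'bronze']

# Step 3: average the bronze counts that remain (9 and 10).
avg_bronze = bronze_of_winners.mean()
print(avg_bronze)  # 9.5
```

Only Russian Fed. and Norway pass the mask here, so the result is the mean of their bronze counts.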

Compute the average number of gold, silver, and bronze medals:

# axis=0 averages down each column, giving one mean per medal type
avg_medal_count = numpy.mean(df[['gold', 'silver', 'bronze']], axis=0)
print(avg_medal_count)
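
What "average per medal type" means is easiest to see next to the single overall average. A small sketch, assuming a reasonably recent pandas and reusing four rows from the lists above (`sample`, `column_means`, and `overall_mean` are illustrative names):

```python
import pandas as pd

# Rows for Russian Fed., Norway, Italy, and Latvia from the lists above.
sample = pd.DataFrame({
    'gold': [13, 11, 0, 0],
    'silver': [11, 5, 2, 2],
    'bronze': [9, 10, 6, 2],
})

# One mean per column: the result is a Series indexed by column name.
column_means = sample.mean(axis=0)
print(column_means)

# A single mean over all twelve cells, for comparison.
overall_mean = sample.to_numpy().mean()
print(overall_mean)
```

Passing the axis explicitly makes it clear which of the two results is wanted, rather than relying on the default behaviour of the installed version.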

If a gold medal is worth 4 points, a silver medal 2 points, and a bronze medal 1 point, compute the total points for every country:

# The dot product with [4, 2, 1] is the weighted sum 4*gold + 2*silver + 1*bronze per row
df['points'] = df[['gold', 'silver', 'bronze']].dot([4, 2, 1])
olympic_points_df = df[['country_name', 'points']]
print(olympic_points_df)
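
The `dot` call is a compact way to write a weighted sum over the medal columns. As a rough check, the sketch below compares it with the arithmetic written out by hand on four rows from the lists above (`sample`, `weights`, `points_dot`, and `points_manual` are illustrative names):

```python
import pandas as pd

# Rows for Russian Fed., Norway, Italy, and Latvia from the lists above.
sample = pd.DataFrame({
    'gold': [13, 11, 0, 0],
    'silver': [11, 5, 2, 2],
    'bronze': [9, 10, 6, 2],
})

# Points per medal, in the same order as the selected columns.
weights = [4, 2, 1]

# Matrix-vector product: each row is multiplied element-wise by the weights and summed.
points_dot = sample[['gold', 'silver', 'bronze']].dot(weights)

# The same computation spelled out column by column.
points_manual = 4 * sample['gold'] + 2 * sample['silver'] + 1 * sample['bronze']

print(points_dot.tolist())     # [83, 64, 10, 6]
print(points_manual.tolist())  # [83, 64, 10, 6]
```

The order of the weights has to match the order of the columns in the selection, since `dot` pairs them by position when given a plain list.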
