As before, for the class handout see: R Graphics with Ggplot2 - Day1
For more depth, see this site: R for Data Science
For the previous post, see: Ggplot | Point plot & Box plot
3. Histogram: geom_histogram()
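A histogram shows the distribution of a single numeric variable. Only the X-axis is a variable from the data; the Y-axis is the number of observations that fall into each range (bin) of X values.
ggplot(mpg, aes(cty)) + geom_histogram()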
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi3OxB9CrTVev-CNql3fEPy6ds0JOVtwjvBB3g8XJI0Aauy-aj6NC_x8Htkf5FwOofuIUn8YINI5_xFCkbQabdrUEQ657oc1DcrC2LbLkH_J6gzZUarYe472bVYK4KjB5xO_rEAZU9hyOc/s1600/Histogram+1_cty.png)
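To see where the bar heights come from, here is a small sketch that pulls the computed values back out of the plot. It assumes ggplot2's layer_data() helper and the mpg dataset that ships with the package; the variable names p and bars are just for illustration.

```r
library(ggplot2)

# Build the histogram but keep it in a variable instead of drawing it
p <- ggplot(mpg, aes(cty)) + geom_histogram(bins = 30)

# layer_data() returns what ggplot2 computed for the first layer:
# one row per bar, with its x-range (xmin, xmax) and its height (count)
bars <- layer_data(p)
head(bars[, c("xmin", "xmax", "count")])

# Every car lands in exactly one bar, so the heights add up to the row count
sum(bars$count) == nrow(mpg)
```

The Y-axis is therefore not a column of mpg; it is a count that ggplot2 calculates from cty.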
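You can change the width of the bars with bins = or binwidth =, placed inside geom_histogram().
With bins, the smaller the number, the wider the bars. The default is 30; I find about 25 looks right for this data.
ggplot(mpg, aes(cty)) + geom_histogram(bins = 25, color = 'skyblue')
You can also set a color. It has to go inside geom_histogram(); putting it inside ggplot() has no effect.
Using color = ' ' on its own only sets the color of the bar outlines.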
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRLpJ5ZcDGI6Vy883SXG1fcqrQGzbQcaET79KlJMCDqHGFt4U71VkQ37Odq7pnD_xIxPQOcFiayYf0ExnUb07TErT8ebvB27KLGtF1BBWHRHdFN3v7vCUjca-CVoNONwQ4qnWskgkmNPw/s1600/Histogram+2_cty_bins25_blue.png)
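A quick way to check the "fewer bins, wider bars" rule is to measure the bars directly. This is only a sketch: bar_width() is a hypothetical helper written for this post, not a ggplot2 function, and it relies on layer_data() to read back the computed bar edges.

```r
library(ggplot2)

# Hypothetical helper (for illustration only): average width of the bars
# ggplot2 computes when the histogram is split into n_bins bins
bar_width <- function(n_bins) {
  p <- ggplot(mpg, aes(cty)) + geom_histogram(bins = n_bins)
  bars <- layer_data(p)
  mean(bars$xmax - bars$xmin)
}

bar_width(10)  # fewer bins, wider bars
bar_width(30)  # more bins, narrower bars
```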
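With binwidth, the larger the number, the wider the bars. binwidth has no fixed default: if you set neither argument, ggplot2 uses bins = 30 and derives the width from the range of the data.
Colors again go inside the geom: color = ' ' only sets the outline, and the inside of the bars is set with fill = ' '.
ggplot(mpg, aes(cty)) + geom_histogram(binwidth = 1.5, color = 'pink', fill = 'skyblue')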
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5lqRZip9AGTGIZu-3R4ii_8ISBZQKrPp6JjLQ47OFUTFcYOzoKkNAAq-Ta1HYsh2yfaA4WLOYokbWYL6Qh06oY8jg-wQdMCcem3BEhHmu7hxRNfnjkIEMpSGO4_ifmrZ3FaLacZ3Bbv0/s1600/Histogram+3_cty_binwidth_color-pink_fill-skyblue.png)
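The color and fill values so far are fixed colors, which is why they belong in the geom. If you instead want the color to depend on a variable, it goes inside aes(). The sketch below contrasts the two; using drv as the fill variable is my own choice for illustration and not part of the class material.

```r
library(ggplot2)

# Fixed colors: plain values, set inside the geom and outside aes()
p_fixed <- ggplot(mpg, aes(cty)) +
  geom_histogram(binwidth = 1.5, color = "pink", fill = "skyblue")

# Mapped colors: a variable inside aes(); ggplot2 assigns one fill
# per value of drv and adds a legend
p_mapped <- ggplot(mpg, aes(cty, fill = drv)) +
  geom_histogram(binwidth = 1.5, color = "white")

# Inspect the fill colors that end up on the bars
unique(layer_data(p_fixed)$fill)           # the single fixed color
length(unique(layer_data(p_mapped)$fill))  # one color per drive train
```

Print p_fixed or p_mapped to draw the plots.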
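4. Frequency polygons: geom_freqpoly()
This works the same way as a histogram, except that it draws the counts as a line instead of bars.
The bin width is again controlled with bins or binwidth.
ggplot(mpg, aes(cty)) + geom_freqpoly()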
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgy90DSKJE5CqTLjCDYaxXOwXLCuJ3r68IXzJ_GE_HXXTGGgMQY2qFdsqo-A_OrSH46ZER9QJOKzK1t68zwRmctukwcLH3hJNo7dCghkYMmyb8o6_LpkW9I5WXcGHBXvOm-2tx4GyXyIU0/s1600/Frepoly_cty.png)
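To convince yourself that the two geoms really show the same numbers, you can compare the counts behind each plot. This is a sketch that assumes both layers are given the same binwidth (2 here, picked arbitrarily) and uses layer_data() to read the computed counts.

```r
library(ggplot2)

hist_bins <- layer_data(ggplot(mpg, aes(cty)) + geom_histogram(binwidth = 2))
poly_pts  <- layer_data(ggplot(mpg, aes(cty)) + geom_freqpoly(binwidth = 2))

# The frequency polygon adds an empty bin at each end so the line
# starts and finishes at zero; the non-zero counts match the bars
hist_bins$count[hist_bins$count > 0]
poly_pts$count[poly_pts$count > 0]
```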
5. Bar graph: geom_bar()
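Unlike a histogram, a bar graph is for a categorical X-axis, usually a text (character) variable such as drv, class, or manufacturer. The height of each bar is the number of rows in that category. Colors again go inside the geom.
ggplot(mpg, aes(class)) + geom_bar(color = 'yellow', fill = 'skyblue')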
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1nWUxfVvfxeriloRy_C-0SPhNFefZYyPEQLElxQr_DS2FzjWq9yokQDg5iCZ2VTMcyEU1TBnjWWklfspWRGirIIhLFjnj2OiwNrJ6K4-3extIMZK3JE9h7nyC-H8aCxGUIqV1dfa7GsE/s1600/Bar_class.png)
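Since geom_bar() simply counts rows per category, its bar heights should agree with a plain frequency table. A minimal sketch of that check, using base R's table() and ggplot2's layer_data():

```r
library(ggplot2)

p <- ggplot(mpg, aes(class)) + geom_bar()

# Heights of the bars as computed by ggplot2
bar_heights <- layer_data(p)$count

# The same counts from a plain frequency table
class_counts <- table(mpg$class)
class_counts

all(bar_heights == as.vector(class_counts))
```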
6. Line graph: geom_line()
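A series of numbers over time is usually shown as a line graph, for example economic growth, or unemployment and employment rates.
This example uses the economics dataset, loaded with data(economics).
ggplot(economics, aes(date, unemploy)) + geom_line(color = 'navy')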
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimCiGt5Pu4Aci-wFiNFVKso_sToA4rW3Qs1qXkQFQBBzNGpqC4vu4Hd-IcjvNr_Dr29GzR0pR5MIotyhbu45-paFQc3OtiRvG9Y3KFlHIgtG1FIWDMJ_2ZVogJEMz7E2Wx14gTQJjQRF4/s1600/Line_economics_x-date+y-unemploy.png)
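The plot above shows the raw number of unemployed people. As a sketch of how you might plot a rate instead, you can divide by the pop column inside aes(). Note the assumption: this gives the unemployed share of the total population, which is not the same as the official unemployment rate (that one is based on the labor force).

```r
library(ggplot2)

# unemploy and pop are both recorded in thousands, so their ratio is
# the share of the population that is unemployed
p_rate <- ggplot(economics, aes(date, unemploy / pop)) +
  geom_line(color = "navy")

# Range of the plotted values
range(layer_data(p_rate)$y)
```

Print p_rate to draw the plot.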
You can also combine two geoms in one plot, for example line + point.
ggplot(economics, aes(date, unemploy)) + geom_point(color = 'pink') + geom_line(color = 'navy')
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFQithPHdfr9Ufk1oSUTxVAt7sAadGoKwg_PC4Os_ORg6TA7j_Ss7Ibb_kvYKghXHItiiovxfYxd3vwzHmBOvt3SDhi2IG_MNFhAWQMBfUNEHZieoQceqozYSljVO6OGJ4W6q6QTr9fSA/s1600/Line-Point_economics_x-date+y-unemploy.png)
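When you combine geoms, each one becomes its own layer, and layers are drawn in the order you add them. The sketch below just inspects the plot object to show that; looking at p$layers is my own addition for illustration.

```r
library(ggplot2)

p <- ggplot(economics, aes(date, unemploy)) +
  geom_point(color = "pink") +   # layer 1: drawn first
  geom_line(color = "navy")      # layer 2: drawn on top of the points

# Each geom_*() call adds one entry to p$layers, in drawing order
length(p$layers)
class(p$layers[[1]]$geom)[1]
class(p$layers[[2]]$geom)[1]
```

If you want the points to sit on top of the line, swap the two geom calls.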
To finish, here are two exercises; one of them uses the diamonds dataset. Give them a try if you're interested.
Exercise 1: How is the drive train related to engine size and vehicle class?
ggplot(mpg, aes(displ, class, colour = drv)) + geom_point()
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4SVkAzSVM7VIZXYAZ9iAg7h9il4JzQ7vO06BLEolHk1MMLjewQosRMdUB993gxDVjHOQkPNYHOeYs8oxLkah_3DQPRGMaFsgBmrFoJmRUXig1FeN68j3k5y1yRwu9m0ThqbFOew9ayKk/s1600/Point+8_x-displ+y-class_colour-drv.png)
Exercise 2: How does the price distribution in diamonds vary by cut?
data(diamonds)
ggplot(diamonds, aes(price, colour = cut)) + geom_freqpoly(bins = 50)
![](https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieuvcj8_sOCeAz-NkSipYqxkWOnmZ3lPMVWz3hmoYIleOdk0tKMDsYnlOEiT6iFI91AO9UBPLVVH7gHX20LwKouff0s44W8ODKGx-kJDp6MIYA_No64AE3Ky3MM8OznaJ3TRwRcKpwq-M/s1600/Freqpoly_diamonds_price_color-cut.png)
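One thing that makes this plot hard to read is that the cuts have very different numbers of diamonds, so the lines differ mostly in height. A possible variation, not part of the original exercise answer, is to plot density instead of count so that every line covers the same area. This sketch assumes a ggplot2 version that provides after_stat() (3.3.0 or later).

```r
library(ggplot2)

# Put density on the Y-axis so each cut is scaled to the same total area
p_density <- ggplot(diamonds, aes(price, after_stat(density), colour = cut)) +
  geom_freqpoly(bins = 50)

# Check: for every cut, the sum of density * bin width should be 1
d <- layer_data(p_density)
tapply(d$density * (d$xmax - d$xmin), d$group, sum)
```

Print p_density to draw the plot; the shapes of the distributions are then easier to compare across cuts.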
Those are the six main plot types. The next post will cover adding trend lines.
[ Other references ]
R Graphics Cookbook - Chapter 3: Bar Graphs