Saving the Chart
Being able to render these charts is wonderful; however, at times you need to use them within a presentation. Luckily for us, matplotlib comes with a method that can save the charts we create to a file. The savefig() method supports many different file formats and infers the format from the file extension; we’ll use the common “.jpg”. Let’s save a simple line chart to the local folder:
1| # using savefig method to save the chart as a jpg to the local folder
3| x, y = [ 1600, 1700, 1800, 1900, 2000 ] , [ 0.2, 0.5, 1.1, 2.2, 7.7 ]
5| plt.plot(x, y, "bo-") # creates a blue solid line with circle dots
7| plt.title("World Population Over Time")
8| plt.xlabel("Year")
9| plt.ylabel("Population (billions)")
11| plt.savefig("population.jpg")
Go ahead and run the cell. You’ll now notice a new image file called “population.jpg” in the “python_bootcamp” folder. If you don’t specify a file path, the image is saved in the local folder where the Jupyter Notebook file is located.
Note: You can save the chart in other formats, such as PDF or PNG.
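As a minimal sketch of the note above, the same chart can be written to several formats and to an explicit folder rather than the notebook's own folder. The folder name "charts" and the dpi and bbox_inches values are illustrative assumptions, not settings taken from this chapter:

```python
from pathlib import Path

import matplotlib.pyplot as plt

x, y = [1600, 1700, 1800, 1900, 2000], [0.2, 0.5, 1.1, 2.2, 7.7]

plt.figure()  # start a fresh figure so earlier plots are not included
plt.plot(x, y, "bo-")  # blue solid line with circle markers
plt.title("World Population Over Time")
plt.xlabel("Year")
plt.ylabel("Population (billions)")

out_dir = Path("charts")  # hypothetical output folder
out_dir.mkdir(exist_ok=True)

saved_files = []
for ext in ("png", "pdf"):
    target = out_dir / f"population.{ext}"
    # savefig picks the output format from the file extension
    plt.savefig(target, dpi=150, bbox_inches="tight")
    saved_files.append(target)

plt.close()  # release the figure once it has been written
```

After running it, the "charts" folder should contain "population.png" and "population.pdf". Call savefig() before show(), since in a script the figure may already be cleared once the plot window is closed.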
Flattening Multidimensional Data
Generally, in data analysis you want to avoid 3D plotting wherever possible. The problem isn’t that a 3D plot lacks the information you want to convey; it’s that the same point is often easier to express by other means. One of the best ways to represent a third dimension is to use color instead of depth.
For instance, imagine that you have three datasets that you need to plot: height, weight, and age. You could render a 3D model, but that would be excessive. Instead, you can plot height and weight on a scatter plot, as we did before, and color each dot to represent the age. Color as the third dimension is far easier to read than data depicted along the z axis (depth). Let’s create this exact scatter plot together in the following:
1| # creating a scatter plot to represent height-weight distribution
2| from random import randint, seed
3| seed(2) # seeds the generator so the random values are repeatable
5| height = [ randint(58, 78) for x in range(20) ]
6| weight = [ randint(90, 250) for x in range(20) ]
7| age = [ randint(18, 65) for x in range(20) ] # 20 records between 18 and 65 years old
9| plt.scatter(weight, height, c=age) # sets the age list to be shown by color
11| plt.title("Height-Weight Distribution")
12| plt.xlabel("Weight (lbs)")
13| plt.ylabel("Height (inches)")
14| plt.colorbar(label="Age") # adds color bar to right side
16| plt.show( )
Go ahead and run the cell. By passing the c argument, which represents color, into the scatter plot, we can easily represent three datasets in a 2D manner, as seen in Figure 10-8.
The color bar on the right side is created on line 14, where we also set its label. In some cases you do need to use the z axis, such as when representing spatial data. When possible, however, using color as the third dimension is easier both to create and to read.
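If the default colors are hard to read, the mapping from age to color can be controlled explicitly. The sketch below is one possible variation, not the book's own listing: it assumes the "plasma" colormap and pins the color scale to the 18 to 65 age range with vmin and vmax, and it saves to a hypothetical file name instead of showing the plot:

```python
import random

import matplotlib.pyplot as plt

rng = random.Random(2)  # local seeded generator for repeatable sample data

height = [rng.randint(58, 78) for _ in range(20)]
weight = [rng.randint(90, 250) for _ in range(20)]
age = [rng.randint(18, 65) for _ in range(20)]

plt.figure()
# cmap chooses the color scale; vmin/vmax fix which ages map to its two ends
points = plt.scatter(weight, height, c=age, cmap="plasma", vmin=18, vmax=65)
plt.title("Height-Weight Distribution")
plt.xlabel("Weight (lbs)")
plt.ylabel("Height (inches)")
plt.colorbar(points, label="Age")  # color bar tied to this scatter plot

plt.savefig("height_weight.png")  # hypothetical file name
plt.close()
```

Fixing vmin and vmax keeps the colors comparable between charts, because a given age maps to the same color even when a sample happens not to include the youngest or oldest values.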