It looks like your data.year array is not in any particular order. For a scatter plot that doesn't matter, because each point is drawn independently. However, when you use that array to overlay a fitted line, the points are connected in the order they appear, so the x-values need to be in numerical (in this case chronological) order. Try the following:
plt.plot(np.sort(data.year), np.polyval(p, np.sort(data.year)), 'r-')
This should connect the points in the appropriate order, forming a single curve.
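If it helps, here is a minimal, self-contained sketch of the same idea. The sample data and the helper name sorted_fit_line are made up for illustration, and I am assuming p came from np.polyfit with a degree-1 fit; adjust the degree to whatever you actually used.

```python
import numpy as np

def sorted_fit_line(x, y, deg=1):
    """Hypothetical helper: fit a polynomial, return the fit evaluated at sorted x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    p = np.polyfit(x, y, deg)   # the fit itself does not depend on point order
    x_sorted = np.sort(x)       # ascending x so the line is drawn left to right
    return x_sorted, np.polyval(p, x_sorted)

# Made-up, unordered sample data standing in for data.year and the plotted values
years = np.array([2003, 1999, 2007, 2001, 2005])
values = np.array([3.1, 1.2, 5.0, 2.1, 3.9])

x_line, y_line = sorted_fit_line(years, values)
# Then draw it with: plt.plot(x_line, y_line, 'r-')
```

Sorting once and reusing the result also avoids calling np.sort twice, as the one-liner above does.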