Archives for posts with tag: R

In my investments class, we have to produce charts and perform technical analysis. Though quantmod has the mucho excellente chartSeries() function, I can't leave well enough alone and decided to try to write some functions that will draw a chart using ggplot and add technical indicators.

I have the basic functionality down, but I want to keep adding to the function. Call ggChartSeries() and provide an OHLC object from quantmod, along with start and end dates as Date objects (as returned by as.Date()). It calculates the moving averages first and trims the data series afterwards, as opposed to chartSeries(), which has issues with this since it takes the pre-trimmed data as an input.

require(quantmod)
require(ggplot2)
 
getSymbols('AAPL')
x<-AAPL
start <- Sys.Date()-200
end <- Sys.Date()
 
#Pass an OHLC object into this function
#also pass two dates formatted as.Date()
ggChartSeries <- function(x, start, end){
 
# the below is done redundantly for ease of maintenance later on
#First, strip OHLC data (need to vectorize)
  date <- as.Date(time(x))
  open <- as.vector(Op(x))
  high <- as.vector(Hi(x))
  low <- as.vector(Lo(x))
  close <- as.vector(Cl(x))
 
#Then build the data frame
  xSubset <-data.frame('date'=date,'open'=open,'high'= high,'low'=low,'close'=close)
 
#We want to construct our candlesticks  
  xSubset$candleLower <- pmin(xSubset$open, xSubset$close)
  xSubset$candleMiddle <- NA
  xSubset$candleUpper <- pmax(xSubset$open, xSubset$close)
  xSubset$fill <- ''
  xSubset$fill[xSubset$open < xSubset$close] = 'white'
  xSubset$fill[xSubset$fill ==''] = 'red'
 
#Add Moving Averages
  xSubset$ma200 <- SMA(xSubset$close, 200)
  xSubset$ma50 <- SMA(xSubset$close, 50)
 
#Trim Data
  xSubset <-subset(xSubset, xSubset$date > start & xSubset$date < end)
 
#Graphing Step
  g <- ggplot(xSubset, aes(x=date, lower=candleLower, middle=candleMiddle, upper=candleUpper, ymin=low, ymax=high)) 
  g <- g + geom_boxplot(stat='identity', aes(group=date, fill=fill))
  g <- g + geom_line(aes(x=date, y=ma50))+ geom_line(aes(x=date, y=ma200))
  g 
}
 
#call our graphing function
ggChartSeries(AAPL, start, end)
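
The reason for calculating the moving averages before trimming can be seen in a small sketch. The sma() helper and the price series below are made up for illustration (base R only, standing in for SMA() from the function above); they are not part of ggChartSeries().

```r
# Simple moving average in base R; NA until n observations are available.
# Hypothetical stand-in for SMA(), used only for this illustration.
sma <- function(x, n) as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))

prices <- 1:10   # made-up price series
keep   <- 6:10   # the window we actually want to plot

# Average over the full history, then trim: every plotted point has a value
maThenTrim <- sma(prices, 3)[keep]

# Trim first, then average: the first n - 1 plotted points are NA
trimThenMa <- sma(prices[keep], 3)

print(maThenTrim)
print(trimThenMa)
```

With a 200-day average and a 200-day plotting window, trimming first would leave almost the whole line empty.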

Todo list:

  • Add titles and labeling
  • Add more TA indicators
  • Tweak colors
  • Add/refine options to the function
  • Add volume bars at the bottom

Working on my personal project today, I had to figure out how to sort a data frame. Very little straightforward information exists about doing this, so I ended up piecing it together myself from a gaggle of incomplete internet posts.

The key to doing this is using both order() and with(). order() returns the row indices in the order that sorts its arguments. with() evaluates an expression inside the data frame, so columns can be referred to by name.

My code ended up looking like this:

 

country.trim <- country.trim[with(country.trim, order(-income)),]

I started with my original variable country.trim, a data frame containing the variables country, income, and debt. I wanted to sort the countries by income in descending order. So what we do is sort of "play back" the existing data frame into its sorted form. with() country.trim, we take the order(). order() sorts in ascending order by default, so we negate income (order() also accepts decreasing = TRUE). order() gives us a vector of row indices in their new order, and we play that back into the country.trim variable.

In the more general case, this is what you want to do:

dataFrame.sorted <- dataFrame.original[ with(dataFrame.original, order(sortCriteria1, sortCriteria2,....)) , ]

Each sortCriteria is one of the columns of the data frame. Because of with(), columns are named directly: in my code I write order(-income) rather than order(-country.trim$income). You can have as many sort criteria as you want; prepend a minus sign to any numeric variable that should be sorted in descending order.
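
Here is a minimal sketch of the general pattern with two sort criteria. The data frame and its values are made up for illustration; only order() and with() from base R are used.

```r
# Made-up data, for illustration only
countries <- data.frame(
  country = c("A", "B", "C", "D"),
  income  = c(30, 50, 50, 10),
  debt    = c(5, 9, 2, 7)
)

# Descending income; ties in income are broken by ascending debt
sorted <- countries[with(countries, order(-income, debt)), ]

# The minus sign only works on numeric columns; for a character column,
# decreasing = TRUE does the same job
byName <- countries[order(countries$country, decreasing = TRUE), ]

print(sorted)
print(byName)
```

Note the trailing comma inside the square brackets: the ordering vector selects rows, and the empty second slot keeps all columns.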

The other week, I posted a simple algorithm to figure out Aumann-Serrano riskiness. The algorithm is slow and not very inventive, so I have been brainstorming all week how to improve it.

Convergence for the calculation of A-S Riskiness for weekly AAPL returns

Since we know exactly the value we are trying to reach and the parameters of the output, I figured we could converge on the solution from both sides and arrive at the solution much more quickly.

Thus, I redesigned the algorithm to bounce back and forth between max and min values, halving the interval on each iteration. Here is the source code for my redesigned version of asRisk(). As always, feed it a vector of possible returns. Read the rest of this entry »
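
The full asRisk() source is not reproduced on this page. As a rough sketch of the idea only, and not the author's implementation, here is a plain bisection, assuming the usual definition of Aumann-Serrano riskiness as the positive R that solves mean(exp(-g / R)) = 1 for equally likely returns g. The name asRiskBisect and its tol and maxIter parameters are hypothetical.

```r
# Hypothetical helper (not the author's asRisk()): bisection for the
# Aumann-Serrano riskiness R, the positive root of mean(exp(-g / R)) = 1.
asRiskBisect <- function(g, tol = 1e-10, maxIter = 200) {
  # The index is only defined for positive mean and at least one loss
  stopifnot(mean(g) > 0, any(g < 0))
  f <- function(R) mean(exp(-g / R)) - 1   # > 0 below the root, < 0 above

  # Bracket the root: shrink lo until f(lo) > 0, grow hi until f(hi) < 0
  lo <- hi <- max(abs(g))
  for (i in seq_len(maxIter)) { if (f(lo) > 0) break; lo <- lo / 2 }
  for (i in seq_len(maxIter)) { if (f(hi) < 0) break; hi <- hi * 2 }
  stopifnot(f(lo) > 0, f(hi) < 0)

  # Halve the bracket until it is narrower than the tolerance
  for (i in seq_len(maxIter)) {
    mid <- (lo + hi) / 2
    if (f(mid) > 0) lo <- mid else hi <- mid
    if (hi - lo < tol * hi) break
  }
  (lo + hi) / 2
}

# Equally likely loss of 1 or gain of 2
riskiness <- asRiskBisect(c(-1, 2))
print(riskiness)
```

For this two-outcome gamble the equation can be solved by hand, giving R = 1 / log((1 + sqrt(5)) / 2), roughly 2.078, which makes it a handy check on the numerical result.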

Recently I read an article by Megan McArdle about income inequality, in which she speculated that the income share of the top 1% gets worse during a recession. She posted a graph:
Income share of the top 1% in the US from 1913
I wasn't a fan of the graph. The ticks distracted from the data being presented, and recessions were not highlighted on the graph, as they are on graphs from places like FRED. Fortunately the data from the graph is available and we can make a run of it using R and ggplot. Read the rest of this entry »

HTC from 12-2011 to 2-2012

 

When doing research in foreign equities, I always use quantmod and R to get quotes. Google does not usually support CSV downloads of foreign quotes, but in almost every case, Yahoo does. The "getSymbols()" function in quantmod is fully equipped for this, except for one crucial problem: foreign exchanges often use numbers rather than alphabetical identifiers for ticker symbols, especially in Asia. Examples of this are HTC in Taiwan (2498.TW), NCSoft in Korea (036570.KS), and Ping An in Hong Kong (2318.HK). Read the rest of this entry »
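
The rest of that entry is not shown here, so the following is only a guess at the kind of workaround involved. It assumes the difficulty is that a variable name beginning with a digit is not a syntactic R name. The values are made up, and assign() stands in for the variable that automatic assignment would create.

```r
# A name beginning with a digit cannot be typed bare in R.
# assign() stands in here for what auto-assignment would create.
assign("2498.TW", c(100, 101, 102))   # made-up numbers, not real quotes

viaBackticks <- `2498.TW`        # backquotes make the name usable
viaGet       <- get("2498.TW")   # get() looks the object up by string

print(viaBackticks)

# With quantmod loaded, auto.assign = FALSE returns the data instead of
# creating a variable, so any name can be chosen (needs network access):
# htc <- getSymbols("2498.TW", src = "yahoo", auto.assign = FALSE)
```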

Recently, in my financial statements analysis class, I had to perform a valuation of Apple Inc. with a number of different valuation methods.  One of the things that made valuation simpler is the lack of long-term debt on Apple's balance sheet.  This simple fact means that Apple's WACC is equal to the cost of equity.

To find the cost of equity, I use CAPM, which states

E(R_i) = R_f + \beta_{i}(E(R_m) - R_f)

where E(R_i) is the expected return on the stock (here, Apple's cost of equity), R_f is the risk-free rate, E(R_m) is the expected return on the market, and \beta_{i} is the stock's sensitivity to the market; E(R_m) - R_f is the market risk premium. To find the component pieces R_f, R_m, and \beta_{i}, I will use R with the quantmod package. I will also use the PerformanceAnalytics package, although I will show you how to avoid using it if you choose.

The source code for the project:
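
As a quick worked instance of the formula, here is a tiny sketch. The function name and the input numbers are hypothetical, chosen only to make the arithmetic easy to follow; they are not the figures from the Apple valuation.

```r
# Hypothetical helper and inputs, for illustration only
capmExpectedReturn <- function(riskFree, beta, marketReturn) {
  riskFree + beta * (marketReturn - riskFree)
}

# 2% risk-free rate, beta of 1.1, 10% expected market return:
# 0.02 + 1.1 * (0.10 - 0.02) = 0.108, i.e. a 10.8% cost of equity
costOfEquity <- capmExpectedReturn(riskFree = 0.02, beta = 1.1, marketReturn = 0.10)
print(costOfEquity)
```

All three inputs must be in the same units (all decimals or all percentages) for the result to make sense.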

#Packages required
require(PerformanceAnalytics)
require(quantmod)
require(car)
 
#Here we get the symbols for the SP500 (GSPC), AAPL, and 5yr Treasuries (GS5)
getSymbols("^GSPC", src = "yahoo", from = as.Date("2008-01-01"), to = as.Date("2011-12-31"))
getSymbols("AAPL", src = "yahoo", from = as.Date("2009-01-01"), to = as.Date("2011-12-31"))
getSymbols("GS5", src = "FRED", from = as.Date("2008-12-01"), to = as.Date("2011-12-31"))
 
#Market return R_m is the arithmetic mean of SP500 yearly returns, 2009 through 2011
#Riskfree rate is arithmetic mean of 5yr treasuries
#GS5 is quoted in percent, so divide by 100 to match the decimal returns
marketRisk<- mean(yearlyReturn(GSPC['2009::2011']))
riskFree <- mean(GS5['2009::2011'])/100
 
#My professor advised us to use weekly returns taken on wednesday
#so I take a subset of wednesdays first and then pass that subset
#to the quantmod function weeklyReturn()
AAPL.wed <- AAPL['2009::2011']
AAPL.wed <- AAPL.wed[weekdays(time(AAPL.wed))=='Wednesday']
AAPL.weekly <- weeklyReturn(AAPL.wed)
GSPC.wed <- GSPC['2009::2011']
GSPC.wed <- GSPC.wed[weekdays(time(GSPC.wed))=='Wednesday']
GSPC.weekly <- weeklyReturn(GSPC.wed)
 
#Here I use PerformanceAnalytics functions for alpha+beta
#Then we calculate Cost of equity using our calculated figures
AAPL.beta <- CAPM.beta(AAPL.weekly,GSPC.weekly)
AAPL.alpha <- CAPM.alpha(AAPL.weekly,GSPC.weekly)
AAPL.expectedReturn <- riskFree + AAPL.beta * (marketRisk-riskFree)
 
#For my graph, I want to show R^2, so we get it from the
#lm object AAPL.reg
AAPL.reg<-lm(AAPL.weekly~GSPC.weekly)
AAPL.rsquared<-summary(AAPL.reg)$r.squared
 
#Lastly, we graph the returns and fit line, along with info
scatterplot(100*as.vector(GSPC.weekly),100*as.vector(AAPL.weekly), smooth=FALSE, main='Apple Inc. vs. S&P 500 2009-2011',xlab='S&P500 Returns', ylab='Apple Returns',boxplots=FALSE)
text(5,-10,paste('y = ',signif(AAPL.alpha,digits=4),' + ',signif(AAPL.beta,digits=5),'x \n R^2 = ',signif(AAPL.rsquared,digits=6),'\nn=',length(as.vector(AAPL.weekly)),sep=''),font=2)

The code is commented, but I will make some additional comments on specific sections to explain the process for those unsure. I apologize for my unstandardized variable names as well!

First of all, I use the getSymbols() function, which supports a few sources. In this example, I use Yahoo data for equity data and FRED for information on 5yr Treasuries. For reference, the ticker for retrieving the SP500 on Yahoo is "^GSPC", and the FRED code for 5yr treasuries is "GS5". Other symbols should be self-explanatory.

Next is the issue of regression parameters. To find alpha and beta, I use the CAPM.alpha() and CAPM.beta() functions of PerformanceAnalytics, but to find R^2 I read it out of the regression object using

AAPL.reg <- lm(AAPL.weekly~GSPC.weekly)
AAPL.rsquared <- summary(AAPL.reg)$r.squared

It is possible to read alpha and beta out of the regression object as well. I did not, because I had not originally set out to find R^2 and had turned to PerformanceAnalytics out of convenience.
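
For those who want to skip PerformanceAnalytics, here is a minimal sketch of reading alpha, beta, and R^2 from a single lm() fit. The returns are simulated purely for illustration (they are not the AAPL or S&P 500 data), and the variable names are my own.

```r
# Simulated weekly returns, for illustration only
set.seed(1)
market <- rnorm(150, mean = 0.002, sd = 0.02)
stock  <- 0.001 + 1.2 * market + rnorm(150, sd = 0.01)

fit <- lm(stock ~ market)

alpha    <- unname(coef(fit)[1])     # intercept of the regression
beta     <- unname(coef(fit)[2])     # slope on the market return
rsquared <- summary(fit)$r.squared

# Beta is also the covariance with the market over the market variance
betaCheck <- cov(stock, market) / var(market)

print(c(alpha = alpha, beta = beta, rsquared = rsquared))
```

With a risk-free rate of zero, as in the calls above, the intercept and slope should line up with what CAPM.alpha() and CAPM.beta() report.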

Finally, I graphed the results and regression line for the benefit of my teacher, the results of which can be seen here:

S&P500 vs. Apple, 2009-2011
