HTC from 12-2011 to 2-2012

 

When researching foreign equities, I always use quantmod and R to get quotes. Google does not usually support CSV downloads of foreign quotes, but in almost every case Yahoo does. The getSymbols() function in quantmod is fully equipped for this, except for one crucial problem: foreign exchanges, especially in Asia, often use numbers rather than letters for ticker symbols. Examples are HTC in Taiwan (2498.TW), NCSoft in Korea (036570.KS), and Ping An in Hong Kong (2318.HK).

This nomenclature matters because most programming languages disallow variable names that begin with a digit. R's syntax has the same rule, yet by default quantmod will attempt to create an object (an xts time series, unless you ask for another class) named after the ticker symbol. The problem took me a while to discover, since quantmod does not give you an error when you retrieve the quote. For instance, let me retrieve Apple (AAPL):
> getSymbols("AAPL", src = "yahoo")
[1] "AAPL"

I enter the command getSymbols(), and when R has finished downloading the data, it echoes the name of the object it created, "AAPL". Let's try HTC (2498.TW) now:
> getSymbols("2498.tw", src = "yahoo")
[1] "2498.TW"

It appears as though the data has been downloaded correctly, and if you have automated this process, nothing tells you otherwise. The problem only shows up later, as a frustrating error when you try to use the data by name:
> length(AAPL)
[1] 4536
> length(2498.TW)
Error: unexpected symbol in "length(2498.TW"
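
The error above comes from R's parser, not from quantmod: 2498.TW is read as the number 2498. followed by a stray symbol TW. Here is a minimal offline sketch of that behavior, using base R only. The vector is a stand-in for the downloaded quotes, and the example assumes the object really was stored under the literal name that getSymbols() echoed, which I have not checked across quantmod versions.

```r
# Stand-in for the downloaded quotes; a plain vector keeps the example offline.
# Assumption: the object is stored under the literal (non-syntactic) name.
assign("2498.TW", c(1, 2, 3))

# A bare 2498.TW is not a valid R name, so the parser rejects the expression.
parsed <- try(parse(text = "length(2498.TW)"), silent = TRUE)
inherits(parsed, "try-error")   # TRUE

# Backticks or get() reach an object by its literal name.
length(`2498.TW`)               # 3
length(get("2498.TW"))          # 3

# exists() is a quick check for whether such an object is there at all.
exists("2498.TW")               # TRUE
```

Backticks work, but they are easy to forget and awkward in a long script, so a name of your own choosing is usually the more comfortable route.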

The fix is either very simple or very difficult, depending on your situation. Set "auto.assign=FALSE" when calling getSymbols(). The function then returns the data instead of creating a variable for you, so you can assign it to a name of your choice. For example:
> HTC.TW <- getSymbols("2498.TW", src = "yahoo", auto.assign=FALSE)

This call does not echo a name like the previous examples, since you are assigning the result to a variable yourself. That is the simple case. The difficult case is a long script covering many equities from different countries, where the lack of errors can let the problem go unnoticed until much later. Correcting the names on a large scale would likely rely on regular expressions or a predefined list of alternate ticker symbols, combined with the technique above.
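
A rough sketch of that large-scale approach follows. The helper names safe_name() and fetch_quotes() and the aliases vector are hypothetical, made up for this illustration. Instead of a hand-written regular expression it uses base R's make.names(), which prefixes an "X" to names that start with a digit. The download step assumes quantmod is installed and Yahoo is reachable, so it is wrapped in a function and not run here.

```r
# Hypothetical helper: turn ticker symbols into valid R names.
# `aliases` is an optional named character vector, e.g. c("2498.TW" = "HTC.TW").
safe_name <- function(tickers, aliases = character()) {
  tickers <- toupper(tickers)
  known <- tickers %in% names(aliases)
  out <- make.names(tickers)                 # "2498.TW" becomes "X2498.TW"
  out[known] <- unname(aliases[tickers[known]])
  out
}

# Hypothetical wrapper: download each ticker with auto.assign = FALSE and
# return a list whose element names are safe to type.
fetch_quotes <- function(tickers, aliases = character(), src = "yahoo") {
  quotes <- lapply(tickers, function(tk) {
    quantmod::getSymbols(tk, src = src, auto.assign = FALSE)
  })
  names(quotes) <- safe_name(tickers, aliases)
  quotes
}

safe_name(c("AAPL", "2498.tw", "036570.KS"))
safe_name("2498.TW", aliases = c("2498.TW" = "HTC.TW"))

# Usage (needs quantmod and a network connection):
# quotes <- fetch_quotes(c("AAPL", "2498.TW"), aliases = c("2498.TW" = "HTC.TW"))
# length(quotes$HTC.TW)
```

Keeping the quotes in a named list rather than as loose variables also means a failed or misnamed download shows up as a missing list element, which is easier to test for in an automated script.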