I have daily Excel spreadsheets of a stock database (db) that includes both price and fundamental information from the close of the day before. I want to create a db of this information that I will then run my backtests from. I plan to use Matlab to both create the database (save as a *.mat file) and backtest from.
My question is: what is the most efficient way to set up the database for testing?
1) Should I create a sheet for each stock where each row contains the daily information?
2) Should I create a sheet for every day's data?
3) Is there another option I am not considering?
4) How often do you back up your db? Do you overwrite it after a set amount of time?
This is an area I have no expertise in. I'd like to hear the pros and cons of each method from others who have created databases. How did you set yours up, and what would you do differently if starting over?
Obviously stock splits, dividends, de-listing, etc. will be issues to be addressed later.
Thank you in advance!
My own platform that does everything for me. There are a few others on the forum that have done, or are doing, similar things.
I am using a SQL database and R for 90% of my work, although I do have a MongoDB database that holds some news and other non-numerical data. C# is used for faster number crunching, or when R is just too slow; I then pass the results to R and use the PerformanceAnalytics package or something similar to look at them.
Once you get to larger and larger datasets you are going to have to look at better database solutions like KDB.
I trade commodity futures so my design was based on forward curves and the ability to create continuous forward curves based on various parameters. I also had to design for spreads.
To speed up the initial development, I created just one table for each series of data: tick, intraday and daily. I have an event calendar that assigns special meaning to certain days. I store historical business dates so that I don't have to deal with re-calculating holidays, early closes, etc.
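To make the one-table-per-series idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 purely as a stand-in (the post above uses SQL Server), and every table, column and function name is hypothetical, not taken from the actual design. The point is that a single long table keyed on (symbol, trade_date) answers both the "one sheet per stock" and the "one sheet per day" question from the original post with a simple query.

```python
import sqlite3


def create_schema(conn):
    """Create one long-format daily table plus a stored business-date calendar.

    All names here are illustrative assumptions, not the poster's schema.
    """
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS daily_bar (
            symbol     TEXT NOT NULL,
            trade_date TEXT NOT NULL,      -- ISO date string, e.g. 2012-01-03
            close      REAL,
            volume     INTEGER,
            PRIMARY KEY (symbol, trade_date)
        );
        -- Secondary index so "all symbols on one day" is also fast.
        CREATE INDEX IF NOT EXISTS idx_daily_bar_date ON daily_bar (trade_date);

        -- Stored calendar: holidays and early closes never get recomputed.
        CREATE TABLE IF NOT EXISTS business_date (
            trade_date  TEXT PRIMARY KEY,
            early_close INTEGER NOT NULL DEFAULT 0
        );
        """
    )


def history_for_symbol(conn, symbol):
    """Equivalent of 'one sheet per stock': every day for one symbol."""
    return conn.execute(
        "SELECT trade_date, close FROM daily_bar "
        "WHERE symbol = ? ORDER BY trade_date",
        (symbol,),
    ).fetchall()


def cross_section(conn, trade_date):
    """Equivalent of 'one sheet per day': every symbol on one date."""
    return conn.execute(
        "SELECT symbol, close FROM daily_bar "
        "WHERE trade_date = ? ORDER BY symbol",
        (trade_date,),
    ).fetchall()
```

Tick and intraday data would get their own tables with a timestamp in place of the date. Fundamental fields could be added as extra columns or kept in a separate table with the same key.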
I back up my db each night. It's a SQL Server database, not my first choice, but speed of development was a priority so I decided to go with everything Microsoft. Surprisingly, I find no issues with performance. The initial load time for about 16 charts x 4 different series (each series being a calculated continuous curve) is under 2 minutes. No issues with real-time charting either (my charts are created with ChartDirector for .NET). I have 8-12 years of daily and intraday data, and about 1 year of tick data. For each day, I have 3 forward months (ticks) to 36 forward months (time) of data. My database size is about 15 GB. So far, I haven't felt the need to separate data by a range of dates.
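On the backup question, one possible approach is a dated nightly copy with a fixed retention window, so old backups get dropped rather than overwritten in place. The sketch below is only an illustration: it uses Python's sqlite3 backup API on a file database as a stand-in (SQL Server has its own backup tooling), and the function name, file naming pattern and the `keep` default are all assumptions.

```python
import sqlite3
from datetime import date
from pathlib import Path


def nightly_backup(db_path, backup_dir, today, keep=7):
    """Copy the live database to a dated file and prune the oldest copies.

    `keep` is how many dated backups to retain (an assumed policy).
    """
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"backup-{today.isoformat()}.db"

    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(target)
    try:
        # Online backup: safe to run while other connections use the db.
        src.backup(dst)
    finally:
        dst.close()
        src.close()

    # ISO dates sort chronologically as plain strings.
    backups = sorted(backup_dir.glob("backup-*.db"))
    for old in backups[: max(len(backups) - keep, 0)]:
        old.unlink()
    return target
```

Calling `nightly_backup("stocks.db", "backups", date.today())` from a scheduled job once a day would keep the last week of copies; the file names are placeholders.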
The attached image is a portion of my database design that might help you. My data comes from DTN so the design is specific to their format but you should get a good idea from it.
I'm not sure what exactly "insert" means, but assuming it means you want to put something on a chart, or check something in a strategy, I would recommend using a communication socket.
With a socket, you can query your "custom" front-end that is interfacing with your proprietary database, and it can return whatever response data you need: data to fill a dataseries, some economic number, a field result, whatever. You can then chart it or use it however you wish in Ninja.
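A rough sketch of that socket idea, in Python for brevity: a small TCP front-end sits in front of the database and answers one-line queries with a JSON reply. Everything here is hypothetical, including the line-based protocol, the `DEMO_DB` dictionary standing in for the proprietary database, and the function names. The client half only shows the request/response shape; the charting platform would need its own equivalent.

```python
import json
import socket
import socketserver
import threading

# Hypothetical stand-in for the proprietary database: series name -> values.
DEMO_DB = {"demo_series": [1.0, 2.0, 3.0]}


class QueryHandler(socketserver.StreamRequestHandler):
    """Read one line (a series name) and reply with one line of JSON."""

    def handle(self):
        key = self.rfile.readline().decode("utf-8").strip()
        reply = {"key": key, "values": DEMO_DB.get(key)}  # None if unknown
        self.wfile.write((json.dumps(reply) + "\n").encode("utf-8"))


def start_server(host="127.0.0.1", port=0):
    """Start the front-end in a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), QueryHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(address, key):
    """Client side: send a series name, get back the decoded JSON reply."""
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall((key + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return json.loads(reader.readline())
```

Usage would look like `server = start_server()` followed by `query(server.server_address, "demo_series")`, with `server.shutdown()` and `server.server_close()` when done.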