Installation and Deployment

Installation Guides

For installation instructions, please see the Installation tutorial for your platform.

IB account structure

Multiple logins and data concurrency

The structure of your IB account has a bearing on the speed with which you can collect real-time and historical data with QuantRocket. In short, the more IB Gateways you run, the more data you can collect. The basics of account structure and data concurrency are outlined below:

  • All interaction with the IB servers, including real-time and historical data collection, is routed through IB Gateway, IB's slimmed-down version of Trader Workstation.
  • IB imposes rate limits on the amount of historical and real-time data that can be received through IB Gateway.
  • Each IB Gateway is tied to a particular set of login credentials. Each login can be running only one active IB Gateway session at any given time.
  • However, an account holder can have multiple logins—at least two, and possibly more depending on the account structure. Each login can run its own IB Gateway session. In this way, an account holder can potentially run multiple instances of IB Gateway simultaneously.
  • QuantRocket is designed to take advantage of multiple IB Gateways. When running multiple gateways, QuantRocket will spread your market data requests among the connected gateways.
  • Since each instance of IB Gateway is rate limited separately by IB, splitting requests between two IB Gateways yields twice the data throughput of sending all requests to a single IB Gateway.
  • Each separate login must separately subscribe to the relevant market data in IB Account Management. (This refers to IB market data subscriptions, not QuantRocket exchange permissions.)

Below are a few common ways to obtain additional logins.

IB account structures are complex and vary by subsidiary, local regulations, the person opening the account, etc. The following guidelines are suggestions only and may not be applicable to your situation.

Second user login

Individual account holders can add a second login to their account. This is designed to allow you to use one login for API trading while using the other login for manual trading or account monitoring in Trader Workstation. However, you can use both logins to collect data with QuantRocket. Note that you can't use the same login to simultaneously run Trader Workstation and collect data with QuantRocket. However, QuantRocket makes it easy to start and stop IB Gateway on a schedule, so the following is an option (a sample countdown schedule is sketched after the list):

  • Login 1 (used for QuantRocket only)
    • IB Gateway always running and available for data collection and placing API orders
  • Login 2 (used for QuantRocket and Trader Workstation)
    • Automatically stop IB Gateway daily at 9:30 AM
    • Run Trader Workstation during the trading session for manual trading/account monitoring
    • Automatically start IB Gateway daily at 4:00 PM so it can be used for overnight data collection
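
For reference, here is a sketch of what the corresponding countdown (cron) entries for the second login might look like, assuming that login runs as the ibg2 service (the service name and exact schedule are illustrative):

# Stop ibg2 daily at 9:30 AM so the login can be used in Trader Workstation
30 9 * * * quantrocket launchpad stop --gateways ibg2
# Start ibg2 daily at 4:00 PM so it can be used for overnight data collection
0 16 * * * quantrocket launchpad start --gateways ibg2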

Advisor/Friends and Family accounts

An advisor account, or the similarly structured Friends and Family account, offers a way to obtain additional logins. Even an individual trader can open a Friends and Family account, in which they serve as their own advisor. The account setup is as follows:

  • Master/advisor account: no trading occurs in this account. The account is funded only with enough money to cover market data costs. This yields 1 IB Gateway login.
  • Master/advisor second user login: like an individual account, the master account can create a second login, subscribe to market data with this login, and use it for data collection.
  • Client account: this is the main trading account where the trading funds are deposited. This account receives its own login (for 3 total). By default this account does not have trading permissions, but you can enable client trading permissions via the master account, then subscribe to market data in the client account and begin using the client login to run another instance of IB Gateway. (Note that it's not possible to add a second login for a client account.)

If you have other accounts such as retirement accounts, you can add them as additional client accounts and obtain additional logins.

Paper trading accounts

Each IB account holder can enable a paper trading account for simulated trading. You can share market data with your paper account and use the paper account login with QuantRocket to collect data, as well as to paper trade your strategies. You don't need to switch to using your live account until you're ready for live trading (although it's also fine to use your live account login from the start).

Note that, due to restrictions on market data sharing, it's not possible to run IB Gateway using the live account login and corresponding paper account login at the same time. If you try, one of the sessions will disconnect the other session.

IB market data permissions

To collect IB data using QuantRocket, you must subscribe to the relevant market data in your IB account. In IB Account Management, click on Settings > User Settings > Market Data Subscriptions:

[Screenshot: IB Account Management market data subscriptions navigation]

Click the edit icon then select and confirm the relevant subscriptions:

[Screenshot: IB market data subscription selection]

Market data for paper accounts

IB paper accounts do not directly subscribe to market data. Rather, to access market data using your IB paper account, subscribe to the data in your live account and share it with your paper account. Log in to IB Account Management with your live account login and go to Settings > Account Settings > Paper Trading Account:

[Screenshot: IB Account Management paper trading account navigation]

Then select the option to share your live account's market data with your paper account:

[Screenshot: IB paper trading market data sharing setting]

IB Gateway

QuantRocket connects to IB's servers through IB Gateway, IB's lightweight alternative to Trader Workstation. You can run one or more IB Gateway services through QuantRocket, where each gateway instance is associated with a different username and password.

Start/stop IB Gateway

QuantRocket's IB Gateway service utilizes IBC, a popular tool for automating the startup and shutdown of IB Gateway or Trader Workstation. IBC is best suited for running a single, manually-configured instance of IB Gateway or Trader Workstation on a desktop (i.e. non-headless) computer. By running IBC inside a Docker service, QuantRocket adds extra functionality that allows for cloud as well as local deployments: automated configuration; headless installation with VNC access for troubleshooting; and the ability to run and control multiple IB Gateway instances via a REST API.

Launchpad is the name of the QuantRocket service used for launching and stopping IB Gateway. You can check the current status of your IB Gateway services (QuantRocket services use this endpoint to determine which gateways to connect to when requesting market data):

$ quantrocket launchpad status
ibg1: running
ibg2: running
ibg3: stopped
>>> from quantrocket.launchpad import list_gateway_statuses
>>> list_gateway_statuses()
{
    u'ibg1': u'running',
    u'ibg2': u'running',
    u'ibg3': u'stopped'
}
$ curl -X GET 'http://houston/launchpad/gateways'
{
    "ibg1": "running",
    "ibg2": "running",
    "ibg3": "stopped"
}

Although IB Gateway is advertised as not requiring a daily restart like Trader Workstation, it's not unusual for IB Gateway to display unexpected behavior (such as not returning market data when requested) that is resolved simply by restarting it. Therefore you might find it beneficial to restart your gateways from time to time, which you can do via countdown, QuantRocket's cron service:

# Restart IB Gateways nightly at 1AM
0 1 * * * quantrocket launchpad stop --wait && quantrocket launchpad start

Or, perhaps you use one of your IB logins during the day to monitor the market using Trader Workstation, but in the evenings you'd like to use this login to add concurrency to your historical data downloads. You could start and stop the IB Gateway service in conjunction with the download:

# Download data in the evenings using all logins, but then disconnect from ibg2
30 17 * * 1-5 quantrocket launchpad start --wait --gateways ibg2 && quantrocket history collect "nasdaq_eod" && quantrocket launchpad stop --gateways ibg2

Market data permission file

Generally, loading your market data permissions into QuantRocket is only necessary when you are running multiple IB Gateway services with different market data permissions for each.

To retrieve market data from IB, you must subscribe to the appropriate market data subscriptions in IB Account Management. QuantRocket can't identify your subscriptions via API, so you must tell QuantRocket about your subscriptions by loading a YAML configuration file. If you don't load a configuration file, QuantRocket will assume you have market data permissions for any data you request through QuantRocket. If you only run one IB Gateway service, this is probably sufficient and you can skip the configuration file. However, if you run multiple IB Gateway services with separate market data permissions for each, you will probably want to load a configuration file so QuantRocket can route your requests to the appropriate IB Gateway service. You should also update your configuration file whenever you modify your market data permissions in IB Account Management.

QuantRocket looks for a market data permission file called quantrocket.launchpad.permissions.yml in the top-level of the Jupyter file browser (that is, /codeload/quantrocket.launchpad.permissions.yml). The format of the YAML file is shown below:

# each top-level key is the name of an IB Gateway service
ibg1:
    # list the exchanges, by security type, this gateway has permission for
    marketdata:
        STK:
            - NYSE
            - ISLAND
            - TSEJ
        FUT:
            - GLOBEX
            - OSE
        CASH:
            - IDEALPRO
    # list the research services this gateway has permission for
    # (options: reuters, wsh)
    research:
        - reuters
        - wsh
# if you have multiple IB Gateway services, include a section for each
ibg2:
    marketdata:
        STK:
            - NYSE

When you create or edit this file, QuantRocket will detect the change and load the configuration. It's a good idea to have flightlog open when you do this. If the configuration file is valid, you'll see a success message:

2018-08-12 09:39:31 quantrocket.launchpad: INFO Successfully loaded /codeload/quantrocket.launchpad.permissions.yml

If the configuration file is invalid, you'll see an error message:

2018-08-12 09:46:46 quantrocket.launchpad: ERROR Could not load /codeload/quantrocket.launchpad.permissions.yml:
2018-08-12 09:46:46 quantrocket.launchpad: ERROR unknown key(s) for service ibg1: marketdata-typo

You can also dump out the currently loaded config to confirm it is as you expect:

$ quantrocket launchpad config
ibg1:
  marketdata:
    CASH:
    - IDEALPRO
    FUT:
    - GLOBEX
    - OSE
    STK:
    - NYSE
    - ISLAND
    - TSEJ
  research:
  - reuters
  - wsh
ibg2:
  marketdata:
    STK:
    - NYSE
>>> from quantrocket.launchpad import get_launchpad_config
>>> get_launchpad_config()
{
    'ibg1': {
        'marketdata': {
            'CASH': [
                'IDEALPRO'
            ],
            'FUT': [
                'GLOBEX',
                'OSE'
            ],
            'STK': [
                'NYSE',
                'ISLAND',
                'TSEJ'
            ]
        },
        'research': [
            'reuters',
            'wsh'
        ]
    },
    'ibg2': {
        'marketdata': {
            'STK': [
                'NYSE'
            ]
        }
    }
 }
$ curl -X GET 'http://houston/launchpad/config'
{
    "ibg1": {
        "marketdata": {
            "CASH": [
                "IDEALPRO"
            ],
            "FUT": [
                "GLOBEX",
                "OSE"
            ],
            "STK": [
                "NYSE",
                "ISLAND",
                "TSEJ"
            ]
        },
        "research": [
            "reuters",
            "wsh"
        ]
    },
    "ibg2": {
        "marketdata": {
            "STK": [
                "NYSE"
            ]
        }
    }
 }

IB Gateway GUI

Normally you won't need to access the IB Gateway GUI. However, you might need access to troubleshoot a login issue, or if you've enabled two-factor authentication for IB Gateway.

To allow access to the IB Gateway GUI, QuantRocket uses NoVNC, which uses the WebSockets protocol to support VNC connections in the browser. First, start IB Gateway if it's not already running:

$ quantrocket launchpad start -g ibg1 --wait
ibg1:
  status: running

To open an IB Gateway GUI connection in your browser, click the Commands menu in JupyterLab, search for "QuantRocket", and click "IB Gateway GUI". The IB Gateway GUI will open in a new window (make sure your browser doesn't block the pop-up).

[Screenshot: IB Gateway GUI in the browser]

To quit the VNC session but leave IB Gateway running, simply close your browser tab.

For improved security for cloud deployments, QuantRocket doesn't directly expose any VNC ports to the outside. By proxying VNC connections through houston using NoVNC, such connections are protected by Basic Auth and SSL, just like every other request sent through houston.

IB Gateway log files

If you need to send your IB Gateway log files to IB for troubleshooting, you can use the IB Gateway GUI to export the log files to the Docker filesystem, then copy them to your local filesystem.

  1. With IB Gateway running, open the GUI.
  2. In the IB Gateway GUI, click File > Gateway Logs, and select the day you're interested in.
  3. For small logs, you can view the logs directly in IB Gateway and copy them to your clipboard.
  4. For larger logs, click Export Logs or Export Today Logs. A file browser will open, showing the filesystem inside the Docker container.
  5. Export the log file to an easy-to-find location such as /tmp/ibgateway-exported-logs.txt.
  6. From the host machine, copy the exported logs from the Docker container to your local filesystem. For ibg1 logs saved to the above location, the command would be:
$ docker cp quantrocket_ibg1_1:/tmp/ibgateway-exported-logs.txt ibgateway-exported-logs.txt

Connect from other applications

If you run other applications, you can connect them to your QuantRocket deployment for the purpose of querying data, submitting orders, etc.

To utilize the Python API and/or CLI from outside of QuantRocket, install the client on the application or system you wish to connect from:

$ pip install quantrocket-client

Then, set environment variables to tell the client how to connect to your QuantRocket deployment. For a cloud deployment, this means providing the deployment URL and credentials:

$ # Linux/MacOS syntax:
$ export HOUSTON_URL=https://quantrocket.123capital.com
$ export HOUSTON_USERNAME=myusername
$ export HOUSTON_PASSWORD=mypassword

$ # Windows syntax (restart PowerShell afterwards for change to take effect):
$ [Environment]::SetEnvironmentVariable("HOUSTON_URL", "https://quantrocket.123capital.com", "User")
$ [Environment]::SetEnvironmentVariable("HOUSTON_USERNAME", "myusername", "User")
$ [Environment]::SetEnvironmentVariable("HOUSTON_PASSWORD", "mypassword", "User")

For connecting to a local deployment, only the URL is needed:

$ # Linux/MacOS syntax:
$ export HOUSTON_URL=http://localhost:1969

$ # Windows syntax (restart PowerShell afterwards for change to take effect):
$ [Environment]::SetEnvironmentVariable("HOUSTON_URL", "http://localhost:1969", "User")

Environment variable syntax varies by operating system. Don't forget to make your environment variables persistent by adding them to .bashrc (Linux) or .profile (MacOS) and sourcing it (for example source ~/.bashrc), or restarting PowerShell (Windows).
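
For example, on Linux you might persist the variables by appending them to .bashrc and sourcing it (adjust the URL and credentials to match your deployment):

$ echo 'export HOUSTON_URL=http://localhost:1969' >> ~/.bashrc
$ source ~/.bashrc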

Finally, test that it worked:

$ quantrocket houston ping
msg: hello from houston
>>> from quantrocket.houston import ping
>>> ping()
{u'msg': u'hello from houston'}
$ curl -u myusername:mypassword https://quantrocket.123capital.com/ping
{"msg": "hello from houston"}

To connect from applications running languages other than Python, you can skip the client installation and use the HTTP API directly.

Multi-user deployments

Hedge funds and other multi-user organizations can benefit from the ability to run more than one QuantRocket deployment. You can deploy QuantRocket to two or in some cases more than two computers or cloud servers, depending on your subscription plan.

The user interface for QuantRocket is JupyterLab, which is best suited for use by a single user at a time. While it is possible for multiple users to log in to the same QuantRocket cloud deployment, it is usually not ideal because they will be working in a shared JupyterLab environment, with a shared filesystem and notebooks, shared JupyterLab terminals and kernels, and shared compute resources. This will likely lead to stepping on each other's toes.

For hedge funds, a recommended deployment strategy is to run a primary deployment for data collection and live trading, and one or more research deployments (depending on subscription) for research and backtesting.

|                        | Deployed to    | How many  | Connects to IB Gateway | Used for                      | Used by                     |
| ---------------------- | -------------- | --------- | ---------------------- | ----------------------------- | --------------------------- |
| Primary deployment     | Cloud          | 1         | Yes                    | Data collection, live trading | Sys admin / owner / manager |
| Research deployment(s) | Cloud or local | 1 or more | No                     | Research and backtesting      | Quant researchers           |

Collect data on the primary deployment and push it to S3. Once pushed, deep historical data can optionally be purged from the primary deployment, retaining only enough historical data to run live trading. Then, selectively pull databases from S3 onto the research deployment(s), where researchers analyze the data and run backtests.
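
As a hedged sketch of this workflow using the db service, the commands might look like the following; the database code is illustrative, and the s3push/s3pull command names and argument order are assumptions from memory (S3 credentials must already be configured for the db service), so consult the database management documentation for the exact syntax:

$ # on the primary deployment: push a history database to S3
$ quantrocket db s3push history usa-stk-1d
$ # on a research deployment: pull the same database from S3
$ quantrocket db s3pull history usa-stk-1d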

Research deployments can be hosted in the cloud or run on the researcher's local workstation.

Each researcher's code, notebooks, and JupyterLab environment are isolated from those of other researchers. The code can be pushed to separate Git repositories, with sharing and access control managed on the Git repositories.

You can only run IB Gateway on one deployment at a time, due to restrictions imposed by IB. With the deployment strategy above, this is not a problem because IB Gateway only runs on the primary deployment.

Universe Selection

Trading starts with the selection of your trading universe. Many trading platforms assume you already have a list of symbols you want to trade and expect you to hand-enter them into the platform. With IB supporting dozens of global exchanges and thousands upon thousands of individual listings, QuantRocket doesn't assume you already know the ticker symbols of every instrument you might want to trade. QuantRocket makes it easy to retrieve all available listings and flexibly group them into universes that make sense for your trading strategies.

Collect listings

First, decide which exchange(s) you want to work with. You can view exchange listings on the IB website or use QuantRocket to summarize the IB website by security type:
$ quantrocket master exchanges --regions asia --sec-types STK
STK:
  Australia:
  - ASX
  - CHIXAU
  Hong Kong:
  - SEHK
  - SEHKNTL
  - SEHKSZSE
  India:
  - NSE
  Japan:
  - CHIXJ
  - JPNNEXT
  - TSEJ
  Singapore:
  - SGX
>>> from quantrocket.master import list_exchanges
>>> list_exchanges(regions=["asia"], sec_types=["STK"])
{'STK': {'Australia': ['ASX', 'CHIXAU'],
         'Hong Kong': ['SEHK', 'SEHKNTL', 'SEHKSZSE'],
         'India': ['NSE'],
         'Japan': ['CHIXJ', 'JPNNEXT', 'TSEJ'],
         'Singapore': ['SGX']}}
$ curl 'http://houston/master/exchanges?regions=asia&sec_types=STK'
{"STK": {"Australia": ["ASX", "CHIXAU"], "Hong Kong": ["SEHK", "SEHKNTL", "SEHKSZSE"], "India": ["NSE"], "Japan": ["CHIXJ", "JPNNEXT", "TSEJ"], "Singapore": ["SGX"]}}
Let's download contract details for all stock listings on the Hong Kong Stock Exchange:
$ quantrocket master listings --exchange SEHK --sec-types STK
status: the listing details will be collected asynchronously
>>> from quantrocket.master import collect_listings
>>> collect_listings(exchange="SEHK", sec_types=["STK"])
{'status': 'the listing details will be collected asynchronously'}
$ curl -X POST 'http://houston/master/listings?exchange=SEHK&sec_types=STK'
{"status": "the listing details will be collected asynchronously"}
QuantRocket uses the IB website to collect all symbols for the requested exchange, then downloads contract details from the IB API. The download runs asynchronously; check Papertrail or use the CLI to monitor the progress:
$ quantrocket flightlog stream --hist 5
12:07:40 quantrocket.master: INFO Collecting SEHK STK listings from IB website
12:08:29 quantrocket.master: INFO Requesting details for 2220 SEHK listings found on IB website
12:10:06 quantrocket.master: INFO Saved 2215 SEHK listings to securities master database
The number of listings collected from the IB website might be larger than the number of listings actually saved to the database. This is because the IB website lists all symbols that trade on a given exchange, even if the exchange is not the primary listing exchange. For example, the primary listing exchange for Alcoa (AA) is NYSE, but the IB website also lists Alcoa under the BATS exchange because Alcoa also trades on BATS (and many other US exchanges). QuantRocket downloads and saves Alcoa's contract details when you collect NYSE listings, not when you collect BATS listings.

For futures, the number of contracts saved to the database will typically be larger than the number of listings found on the IB website, because the website only lists underlyings while QuantRocket saves all available expiries for each underlying.

Define universes

Once you've collected listings that interest you, you can group them into meaningful universes. Universes provide a convenient way to refer to and manipulate large groups of securities when collecting historical data, running a trading strategy, etc. You can create universes based on exchanges, security types, sectors, liquidity, or any criteria you like.

There are different ways to create a universe. You can download a CSV of securities, manually pare it down to the desired securities, and create the universe from the edited list:

$ quantrocket master get --exchanges SEHK --outfile hongkong_securities.csv
$ # edit the CSV, then:
$ quantrocket master universe "hongkong" --infile hongkong_securities_edited.csv
code: hongkong
inserted: 2216
provided: 2216
total_after_insert: 2216
>>> from quantrocket.master import download_master_file, create_universe
>>> download_master_file("hongkong_securities.csv", exchanges=["SEHK"])
>>> # edit the CSV, then:
>>> create_universe("hongkong", infilepath_or_buffer="hongkong_securities_edited.csv")
{'code': 'hongkong',
 'inserted': 2216,
 'provided': 2216,
 'total_after_insert': 2216}
$ curl -X GET 'http://houston/master/securities.csv?exchanges=SEHK' > hongkong_securities.csv
$ # edit the CSV, then:
$ curl -X PUT 'http://houston/master/universes/hongkong' --upload-file hongkong_securities_edited.csv
{"code": "hongkong", "provided": 2216, "inserted": 2216, "total_after_insert": 2216}

Using the CLI, you can create a universe in one line by piping the downloaded CSV to the universe command:

$ quantrocket master get --exchanges SEHK --sectors "Financial" | quantrocket master universe "hongkong-fin" --infile -
code: hongkong-fin
inserted: 416
provided: 416
total_after_insert: 416

You can also create a universe from existing universes:

$ quantrocket master universe "asx" --from-universes "asx-sml" "asx-mid" "asx-lrg"
code: asx
inserted: 1604
provided: 1604
total_after_insert: 1604
>>> from quantrocket.master import create_universe
>>> create_universe("asx", from_universes=["asx-sml", "asx-mid", "asx-lrg"])
{'code': 'asx',
 'inserted': 1604,
 'provided': 1604,
 'total_after_insert': 1604}
$ curl -X PUT 'http://houston/master/universes/asx?from_universes=asx-sml&from_universes=asx-mid&from_universes=asx-lrg'
{"code": "asx", "provided": 1604, "inserted": 1604, "total_after_insert": 1604}

Filter by securities master fields with csvgrep

You can filter securities master queries by a variety of fields including Symbol, Exchange, Currency, Sector, and more. (Run quantrocket master get -h to see filtering options.) However, sometimes you may want to filter by a field that is not exposed by the API. From a terminal, you can use csvgrep for this purpose. For example, NASDAQ stocks are divided into the NMS (National Market System) and SCM (SmallCap Market) listing tiers, which are stored in the TradingClass field. To create a separate universe for each listing tier:

$ quantrocket master get -e 'NASDAQ' | csvgrep --columns 'TradingClass' --match 'NMS' | quantrocket master universe 'nasdaq-nms' -f -
code: nasdaq-nms
inserted: 2252
provided: 2252
total_after_insert: 2252
$ quantrocket master get -e 'NASDAQ' | csvgrep --columns 'TradingClass' --match 'SCM' | quantrocket master universe 'nasdaq-scm' -f -
code: nasdaq-scm
inserted: 778
provided: 778
total_after_insert: 778

Or save a CSV of OTC stocks, excluding the "NOINFO" trading class:

$ quantrocket master get -e 'PINK' | csvgrep --columns 'TradingClass' --match 'NOINFO' --invert-match > pink_with_info.csv

Define universes by fundamental data availability

If you want to limit a universe to stocks with fundamental data, the best approach is to create a universe comprising the entire pool of relevant securities, collect the needed data for this universe, then create the sub-universe.

Suppose we've collected all NYSE stock listings and want to create a universe of all NYSE stocks with Reuters estimates available. First, define a universe of all NYSE stocks and collect estimates:

$ quantrocket master get -e 'NYSE' -t 'STK' | quantrocket master universe 'nyse-stk' -f -
code: nyse-stk
inserted: 3109
provided: 3109
total_after_insert: 3109
$ quantrocket fundamental collect-estimates --universes 'nyse-stk'
status: the fundamental data will be collected asynchronously

Wait for the fundamental data to be collected (monitor flightlog for status). Then, since a universe can be created from any file with a ConId column, simply download a file of estimates for the desired codes and re-upload the file to create the universe:

$ quantrocket fundamental estimates 'BVPS' 'EPS' 'NAV' 'ROE' 'ROA' -u 'nyse-stk' | quantrocket master universe 'nyse-stk-with-estimates' -f -
code: nyse-stk-with-estimates
inserted: 1957
provided: 1957
total_after_insert: 1957

Define universes by dollar volume

Alternatively, suppose we want to create 3 universes - smallcaps, midcaps, and largecaps - based on the 90-day average dollar volume of NYSE stocks. First, create a history database and collect historical data for all NYSE stocks (see the Historical Data section for more detail).

$ quantrocket history create-db 'nyse-eod' --bar-size '1 day' --universes 'nyse-stk'
status: successfully created quantrocket.history.nyse-eod.sqlite
$ quantrocket history collect 'nyse-eod'
status: the historical data will be collected asynchronously
Once the historical data has been collected (monitor flightlog for status), you can use pandas and the Python client to determine average dollar volume and create your universes. First, query the history database and load into pandas:
>>> from quantrocket.history import get_historical_prices
>>> from quantrocket.master import create_universe
>>> import io
>>> prices = get_historical_prices("nyse-eod", fields=["Close", "Volume"])

Next we calculate daily dollar volume and take a 90-day average:

>>> closes = prices.loc["Close"]
>>> volumes = prices.loc["Volume"]
>>> dollar_volumes = closes * volumes
>>> avg_dollar_volumes = dollar_volumes.rolling(window=90).mean()
>>> # we'll make our universes based on the latest day's averages
>>> avg_dollar_volumes = avg_dollar_volumes.iloc[-1]
>>> avg_dollar_volumes.describe()
count    2.255000e+03
mean     3.609773e+07
std      9.085866e+07
min      3.270559e+04
25%      1.058080e+06
50%      6.229675e+06
75%      3.399090e+07
max      1.719344e+09
Name: 2017-08-15 00:00:00, dtype: float64

Let's make universes of $1-5M, $5-25M, and $25M+:

>>> sml = avg_dollar_volumes[(avg_dollar_volumes >= 1000000) & (avg_dollar_volumes < 5000000)]
>>> mid = avg_dollar_volumes[(avg_dollar_volumes >= 5000000) & (avg_dollar_volumes < 25000000)]
>>> lrg = avg_dollar_volumes[avg_dollar_volumes >= 25000000]

The index of each Series contains the conids which are needed to make the universes, so we write the Series to in-memory CSVs and pass the CSVs to the master service:

>>> f = io.StringIO()
>>> sml.to_csv(f, header=True)
>>> create_universe("nyse-sml", infilepath_or_buffer=f)
{'code': 'nyse-sml',
 'inserted': 509,
 'provided': 509,
 'total_after_insert': 509}
>>> f = io.StringIO()
>>> mid.to_csv(f, header=True)
>>> create_universe("nyse-mid", infilepath_or_buffer=f)
{'code': 'nyse-mid',
 'inserted': 530,
 'provided': 530,
 'total_after_insert': 530}
>>> f = io.StringIO()
>>> lrg.to_csv(f, header=True)
>>> create_universe("nyse-lrg", infilepath_or_buffer=f)
{'code': 'nyse-lrg',
 'inserted': 665,
 'provided': 665,
 'total_after_insert': 665}

On a side note, now that you've created different universes for different market caps, a typical workflow might involve creating a history database for each universe. As described more fully in the Historical Data documentation, you can seed your databases for each market cap segment from the historical data you've already collected, saving you the trouble of re-collecting the data from scratch.

$ quantrocket history create-db 'nyse-sml-eod' --bar-size '1 day' --universes 'nyse-sml'
status: successfully created quantrocket.history.nyse-sml-eod.sqlite
$ quantrocket history get 'nyse-eod' --universes 'nyse-sml' | quantrocket history load 'nyse-sml-eod'
db: nyse-sml-eod
loaded: 572081

See the Historical Data documentation for more details on copying data from one history database to another.

Futures rollover rules

You can define rollover rules for the futures contracts you trade, and QuantRocket will automatically calculate the rollover date for each expiry and store it in the securities master database. Your rollover rules are used to determine the front month contract when stitching together continuous futures contracts and when automating position rollover.

The format of the rollover rules configuration file is shown below:

# quantrocket.master.rollover.yml

# each top level key is an exchange code
GLOBEX:
  # each second-level key is an underlying symbol
  ES:
    # the rollrule key defines how to derive the rollover date
    # from the expiry/LastTradeDate; the arguments will be passed
    # to bdateutil.relativedelta. For valid args, see:
    # https://dateutil.readthedocs.io/en/stable/relativedelta.html
    # https://github.com/ryanss/python-bdateutil#documentation
    rollrule:
      # roll 8 calendar days before expiry
      days: -8
    # if the same rollover rules apply to numerous futures contracts,
    # you can save typing and enter them all at once under the same_for key
    same_for:
      - NQ
      - RS
      - YM
  MXP:
    # If you want QuantRocket to ignore certain contract months,
    # you can specify the months you want (using numbers not letters)
    # Only the March, June, Sept, and Dec MXP contracts are liquid
    only_months:
      - 3
      - 6
      - 9
      - 12
    rollrule:
      # roll 7 calendar days before expiry
      days: -7
    same_for:
      - GBP
      - JPY
      - AUD
  HE:
    rollrule:
      # roll on 27th day of month prior to expiry month
      months: -1
      day: 27
NYMEX:
  RB:
    rollrule:
      # roll 2 business days before expiry
      bdays: -2

You can load your rollover rules into a running deployment as follows:

$ quantrocket master rollrules /path/to/quantrocket.master.rollover.yml
status: the config will be loaded asynchronously
>>> from quantrocket.master import load_rollrules_config
>>> load_rollrules_config("/path/to/quantrocket.master.rollover.yml")
{u'status': u'the config will be loaded asynchronously'}
$ curl -X PUT 'http://houston/master/config/rollover' --upload-file /path/to/quantrocket.master.rollover.yml
{"status": "the config will be loaded asynchronously"}

The rollover rules configuration file, if you upload one, is stored in QuantRocket as quantrocket.master.rollover.yml. This is the filename you should use if you wish to store the configuration file in a Git repository and have QuantRocket automatically load it at the time of deployment using the codeload service.
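
For example, assuming your code repository is cloned at the top level of the Jupyter file browser (/codeload), you might version the rollover rules like this (the repository layout and commit message are hypothetical):

$ cd /codeload
$ cp /path/to/quantrocket.master.rollover.yml quantrocket.master.rollover.yml
$ git add quantrocket.master.rollover.yml
$ git commit -m 'add futures rollover rules'
$ git push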

You can query your rollover dates:

$ quantrocket master get --exchanges GLOBEX --symbols ES --sec-types FUT --fields Symbol LastTradeDate RolloverDate | csvlook -I
| ConId     | Symbol | LastTradeDate       | RolloverDate |
| --------- | ------ | ------------------- | ------------ |
| 177525433 | ES     | 2016-03-18T00:00:00 | 2016-03-10   |
| 187532577 | ES     | 2016-06-17T00:00:00 | 2016-06-09   |
| 197307551 | ES     | 2016-09-16T00:00:00 | 2016-09-08   |
| 206848474 | ES     | 2016-12-16T00:00:00 | 2016-12-08   |
| 215465490 | ES     | 2017-03-17T00:00:00 | 2017-03-09   |
| 225652200 | ES     | 2017-06-16T00:00:00 | 2017-06-08   |
| 236950077 | ES     | 2017-09-15T00:00:00 | 2017-09-07   |
| 247950613 | ES     | 2017-12-15T00:00:00 | 2017-12-07   |
| 258973438 | ES     | 2018-03-16T00:00:00 | 2018-03-08   |
>>> from quantrocket.master import download_master_file
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> download_master_file(f, exchanges=["GLOBEX"], symbols=["ES"], sec_types=["FUT"], fields=["Symbol", "LastTradeDate", "RolloverDate"])
>>> df = pd.read_csv(f)
>>> df.tail()
       ConId Symbol LastTradeDate RolloverDate
8   236950077     ES    2017-09-15   2017-09-07
9   247950613     ES    2017-12-15   2017-12-07
10  258973438     ES    2018-03-16   2018-03-08
11  269745169     ES    2018-06-15   2018-06-07
12  279396694     ES    2018-09-21   2018-09-13
$ curl 'http://houston/master/securities.csv?exchanges=GLOBEX&symbols=ES&sec_types=FUT&fields=Symbol&fields=LastTradeDate&fields=RolloverDate'
177525433,ES,2016-03-18T00:00:00,2016-03-10
187532577,ES,2016-06-17T00:00:00,2016-06-09
197307551,ES,2016-09-16T00:00:00,2016-09-08
206848474,ES,2016-12-16T00:00:00,2016-12-08
215465490,ES,2017-03-17T00:00:00,2017-03-09
225652200,ES,2017-06-16T00:00:00,2017-06-08
236950077,ES,2017-09-15T00:00:00,2017-09-07
247950613,ES,2017-12-15T00:00:00,2017-12-07
258973438,ES,2018-03-16T00:00:00,2018-03-08
Or query only the front month contract:
$ quantrocket master get --exchanges GLOBEX --symbols ES --sec-types FUT --frontmonth --pretty
           ConId = 236950077
          Symbol = ES
         SecType = FUT
             Etf = 0
     PrimaryExchange = GLOBEX
        Currency = USD
     LocalSymbol = ESU7
    TradingClass = ES
      MarketName = ES
        LongName = E-mini S&P 500
        Timezone = America/Chicago
          Sector =
        Industry =
        Category =
         MinTick = 0.25
  PriceMagnifier = 1
MdSizeMultiplier = 1
   LastTradeDate = 2017-09-15
    RolloverDate = 2017-09-07
   ContractMonth = 201709
      Multiplier = 50
         LotSize =
        Delisted = 0
>>> f = io.StringIO()
>>> download_master_file(f, exchanges=["GLOBEX"], symbols=["ES"], sec_types=["FUT"], frontmonth=True, output="txt")
>>> print(f.getvalue())
           ConId = 236950077
          Symbol = ES
         SecType = FUT
             Etf = 0
     PrimaryExchange = GLOBEX
        Currency = USD
     LocalSymbol = ESU7
    TradingClass = ES
      MarketName = ES
        LongName = E-mini S&P 500
        Timezone = America/Chicago
          Sector =
        Industry =
        Category =
         MinTick = 0.25
  PriceMagnifier = 1
MdSizeMultiplier = 1
   LastTradeDate = 2017-09-15
    RolloverDate = 2017-09-07
   ContractMonth = 201709
      Multiplier = 50
         LotSize =
        Delisted = 0
$ curl 'http://houston/master/securities.txt?exchanges=GLOBEX&symbols=ES&sec_types=FUT&frontmonth=true'
           ConId = 236950077
          Symbol = ES
         SecType = FUT
             Etf = 0
     PrimaryExchange = GLOBEX
        Currency = USD
     LocalSymbol = ESU7
    TradingClass = ES
      MarketName = ES
        LongName = E-mini S&P 500
        Timezone = America/Chicago
          Sector =
        Industry =
        Category =
         MinTick = 0.25
  PriceMagnifier = 1
MdSizeMultiplier = 1
   LastTradeDate = 2017-09-15
    RolloverDate = 2017-09-07
   ContractMonth = 201709
      Multiplier = 50
         LotSize =
        Delisted = 0

Option chains

To collect option chains, first collect listings for the underlying securities:

$ quantrocket master listings --exchange 'NASDAQ' --sec-types 'STK' --symbols 'GOOG' 'FB' 'AAPL'
status: the listing details will be collected asynchronously
>>> from quantrocket.master import collect_listings
>>> collect_listings(exchange="NASDAQ", sec_types=["STK"], symbols=["GOOG", "FB", "AAPL"])
{'status': 'the listing details will be collected asynchronously'}
$ curl -X POST 'http://houston/master/listings?exchange=NASDAQ&sec_types=STK&symbols=GOOG&symbols=FB&symbols=AAPL'
{"status": "the listing details will be collected asynchronously"}
Then request option chains for the underlying stocks:
$ quantrocket master get -e 'NASDAQ' -t 'STK' -s 'GOOG' 'FB' 'AAPL' | quantrocket master options --infile -
status: the option chains will be collected asynchronously
>>> from quantrocket.master import download_master_file, collect_option_chains
>>> import io
>>> f = io.StringIO()
>>> download_master_file(f, exchanges=["NASDAQ"], sec_types=["STK"], symbols=["GOOG", "FB", "AAPL"])
>>> collect_option_chains(infilepath_or_buffer=f)
{'status': 'the option chains will be collected asynchronously'}
$ curl -X GET 'http://houston/master/securities.csv?exchanges=NASDAQ&sec_types=STK&symbols=GOOG&symbols=FB&symbols=AAPL' > nasdaq_mega.csv
$ curl -X POST 'http://houston/master/options' --upload-file nasdaq_mega.csv
{"status": "the option chains will be collected asynchronously"}
Once the options request has finished, you can query the options like any other security:
$ quantrocket master get -s 'GOOG' 'FB' 'AAPL' -t 'OPT' --outfile 'options.csv'
>>> from quantrocket.master import download_master_file
>>> download_master_file("options.csv", symbols=["GOOG", "FB", "AAPL"], sec_types=["OPT"])
$ curl -X GET 'http://houston/master/securities.csv?symbols=GOOG&symbols=FB&symbols=AAPL&sec_types=OPT' > options.csv
Option chains often consist of hundreds, sometimes thousands, of options per underlying security. Be aware that requesting option chains for large universes of underlying securities, such as all stocks on the NYSE, can take many hours to complete, add hundreds of thousands of rows to the securities master database, increase the database file size by several hundred megabytes, and potentially add latency to database queries.

Maintain listings

Listings change over time, and QuantRocket helps you keep your securities master database up-to-date. You can monitor for changes to your existing listings (such as a company moving its listing from one exchange to another), delist securities to exclude them from your backtests and trading (without deleting them), and look for new listings.

Listings diffs

Security listings can change - for example, a stock might be delisted from Nasdaq and start trading OTC - and we probably want to be alerted when this happens. We can flag securities where the details as stored in our database differ from the latest details available from IB.
$ quantrocket master diff --universes "nasdaq"
status: the diff, if any, will be logged to flightlog asynchronously
>>> from quantrocket.master import diff_securities
>>> diff_securities(universes=["nasdaq"])
{'status': 'the diff, if any, will be logged to flightlog asynchronously'}
$ curl -X GET 'http://houston/master/diff?universes=nasdaq'
{"status": "the diff, if any, will be logged to flightlog asynchronously"}

If any listings have changed, they'll be logged to flightlog at the WARNING level with a description of what fields have changed. You may wish to schedule this command on your countdown service and monitor Papertrail:

[Screenshot: Papertrail log message showing listings diff]
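
For example, a countdown entry such as the following (the schedule is illustrative) would check the universe for changes once a week:

# Check the nasdaq universe for listing changes each Sunday at 6:00 PM
0 18 * * sun quantrocket master diff --universes 'nasdaq'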

Delist stocks

Perhaps a stock has moved to the pink sheets and we're not interested in it anymore. We can delist it, which will retain the data but allow it to be excluded from our backtests and trading.

$ quantrocket master delist --conid 194245757
msg: delisted conid 194245757
>>> from quantrocket.master import delist_security
>>> delist_security(conid=194245757)
{'msg': 'delisted conid 194245757'}
$ curl -X DELETE 'http://houston/master/securities?conids=194245757'
{"msg": "delisted conid 194245757"}

If you want to automate the delisting, you can run quantrocket master diff with the --delist-missing option, which delists securities that are no longer available from IB, and with the --delist-exchanges option, which delists securities associated with the exchanges you specify (note that IB uses the "VALUE" exchange as a placeholder for some delisted symbols):

$ quantrocket master diff --universes "nasdaq" --delist-missing --delist-exchanges VALUE PINK

When you delist a security, QuantRocket doesn't delete it but simply marks it as delisted so you can exclude it from your queries. If you wish, you can still include it in your queries by using the --delisted option:

$ # By default, exclude delisted securities that would otherwise match the query
$ quantrocket master get --universes "nasdaq" --outfile nasdaq_active.csv
$ # Or include delisted securities
$ quantrocket master get --universes "nasdaq" --delisted --outfile nasdaq_all.csv

Ticker symbol changes

Sometimes when a ticker symbol changes IB will preserve the conid (contract ID); in this case, to incorporate the changes into our database, we can simply collect the listing details for the symbol we care about, which will overwrite the old (stale) listing details:

$ # Look up the symbol's conid and collect the listings for just that conid
$ quantrocket master get --exchanges TSE --symbols OLD --pretty --fields ConId
ConId = 123456
$ quantrocket master listings -i 123456
status: the listing details will be collected asynchronously

However, sometimes IB will issue a new conid. In this case, if you want to continue trading the symbol, you should delist the old symbol, collect the new listing, and append the new symbol to the universe(s) you care about:

$ quantrocket master delist --exchange TSE --symbol OLD
msg: delisted conid 123456
$ quantrocket master listings --exchange TSE --symbols NEW --sec-types STK
$ # check flightlog and wait for listing download to complete, then:
$ quantrocket master get -e TSE -s NEW -t STK | quantrocket master universe "canada" --append --infile -

The above examples expect you to take action in response to individual ticker changes, but what if your universes consist of thousands of stocks and you don't want to deal with them individually? Use quantrocket master diff --delist-missing to automate the delisting of symbols that go missing, as described in the previous section, and use quantrocket master listings to periodically collect any listings that might belong in your universe(s), as described in the next section. If any symbols go missing due to ticker changes that cause IB to issue a new conid, you'll pick up the new listings the next time you run quantrocket master listings.
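
A hedged sketch of such a countdown schedule might look like this (the universe, exchange, and timing are illustrative):

# Each Sunday, collect any new NASDAQ listings, then delist securities that have gone missing
0 8 * * sun quantrocket master listings --exchange 'NASDAQ' --sec-types 'STK'
0 10 * * sun quantrocket master diff --universes 'nasdaq' --delist-missing --delist-exchanges 'VALUE' 'PINK'

Note that newly collected listings still need to be appended to your universes, as described in the next section.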

Add new listings

What if you want to look for new listings that IB has added since your initial universe creation and add them to your universe? First, collect all listings again from IB:
$ quantrocket master listings --exchange SEHK --sec-types STK
status: the listing details will be collected asynchronously
>>> from quantrocket.master import collect_listings
>>> collect_listings(exchange="SEHK", sec_types=["STK"])
{'status': 'the listing details will be collected asynchronously'}
$ curl -X POST 'http://houston/master/listings?exchange=SEHK&sec_types=STK'
{"status": "the listing details will be collected asynchronously"}
You can see what's new by excluding what you already have:
$ quantrocket master get --exchanges SEHK --exclude-universes "hongkong" --outfile new_hongkong_securities.csv
>>> from quantrocket.master import download_master_file
>>> download_master_file("new_hongkong_securities.csv", exchanges=["SEHK"], exclude_universes=["hongkong"])
$ curl -X GET 'http://houston/master/securities.csv?exchanges=SEHK&exclude_universes=hongkong' > new_hongkong_securities.csv
If you like what you see, you can then append the new listings to your universe:
$ quantrocket master universe "hongkong" --infile new_hongkong_securities.csv
code: hongkong
inserted: 10
provided: 10
total_after_insert: 2226
>>> from quantrocket.master import create_universe
>>> create_universe("hongkong", infilepath_or_buffer="new_hongkong_securities.csv", append=True)
{'code': 'hongkong',
 'inserted': 10,
 'provided': 10,
 'total_after_insert': 2226}
$ curl -X PATCH 'http://houston/master/universes/hongkong' --upload-file new_hongkong_securities.csv
{"code": "hongkong", "provided": 10, "inserted": 10, "total_after_insert": 2226}
For futures, IB provides several years of futures expiries. From time to time, you should collect the listings again for your futures exchange(s) in order to collect the new expiries, then add them to any universes in which you want to include them.
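
For example, to refresh ES expiries on GLOBEX and append any new contracts to a hypothetical 'es-fut' universe:

$ quantrocket master listings --exchange 'GLOBEX' --sec-types 'FUT' --symbols 'ES'
status: the listing details will be collected asynchronously
$ # check flightlog and wait for the listing collection to complete, then:
$ quantrocket master get -e 'GLOBEX' -s 'ES' -t 'FUT' | quantrocket master universe 'es-fut' --append --infile -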

Historical Data

QuantRocket makes it easy to retrieve and work with IB's abundant, global historical market data. (Appropriate IB market data subscriptions required.) Simply define your historical data requirements, and QuantRocket will retrieve data from IB according to your requirements and store it in a database for fast, flexible querying. You can create as many databases as you need for your backtesting and trading.

Create historical databases

Create a database by defining, at minimum, the bar size you want and the universe of securities to include. Suppose we've used the master service to define a universe of banking stocks on the Tokyo Stock Exchange, and now we want to collect end-of-day historical data for those stocks. First, create the database:

$ quantrocket history create-db 'japan-bank-eod' --universes 'japan-bank' --bar-size '1 day'
status: successfully created quantrocket.history.japan-bank-eod.sqlite
>>> from quantrocket.history import create_db
>>> create_db("japan-bank-eod", universes=["japan-bank"], bar_size="1 day")
{'status': 'successfully created quantrocket.history.japan-bank-eod.sqlite'}
$ curl -X PUT 'http://houston/history/databases/japan-bank-eod?universes=japan-bank&bar_size=1 day'
{"status": "successfully created quantrocket.history.japan-bank-eod.sqlite"}
Then, fill up the database with data from IB:
$ quantrocket history collect 'japan-bank-eod'
status: the historical data will be collected asynchronously
>>> from quantrocket.history import collect_history
>>> collect_history("japan-bank-eod")
{'status': 'the historical data will be collected asynchronously'}
$ curl -X POST 'http://houston/history/queue?codes=japan-bank-eod'
{"status": "the historical data will be collected asynchronously"}
QuantRocket will first query the IB API to determine how far back historical data is available for each security, then query the IB API again to collect the data for that date range. Depending on the bar size and the number of securities in the universe, collecting data can take from several minutes to several hours. If you're running multiple IB Gateway services, QuantRocket will spread the requests among the services to speed up the process. Based on how quickly the IB API is responding to requests, QuantRocket will periodically estimate how long it will take to collect the data. You can monitor flightlog via the command line or Papertrail to track progress:
$ quantrocket flightlog stream
2017-08-22 13:24:09 quantrocket.history: INFO [japan-bank-eod] Determining how much history is available from IB for japan-bank-eod
2017-08-22 13:25:45 quantrocket.history: INFO [japan-bank-eod] Collecting history from IB for japan-bank-eod
2017-08-22 13:26:11 quantrocket.history: INFO [japan-bank-eod] Expected remaining runtime to collect japan-bank-eod history based on IB response times so far: 0:23:11
2017-08-22 13:55:00 quantrocket.history: INFO [japan-bank-eod] Saved 468771 total records for 85 total securities to quantrocket.history.japan-bank-eod.sqlite
In addition to bar size and universe(s), you can optionally define the type of data you want (for example, trades, bid/ask, midpoint, etc.), a fixed start date instead of "as far back as possible", whether to include trades from outside regular trading hours, whether to use consolidated prices or primary exchange prices, and more. For a complete list of options, view the API Reference. As you become interested in new exchanges or want to test new ideas, you can keep adding as many new databases with as many different configurations as you like.
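As a hedged illustration of combining several of these options, the example below requests 15-minute TRADES bars starting from a fixed date and limited to regular trading hours; the start_date and outside_rth parameter names are assumptions, so consult the API Reference for the exact signature:

>>> from quantrocket.history import create_db
>>> # 15-minute US stock bars from 2012 onward, regular trading hours only (parameter names assumed)
>>> create_db("usa-stk-15min-2012", universes=["usa-stk"], bar_size="15 mins", bar_type="TRADES", start_date="2012-01-01", outside_rth=False)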
Once you've created a database, you can't edit the configuration; you can only add new databases. If you made a mistake or no longer need an old database, you can use the CLI to drop the database and its associated config.

Update historical data

After you create a history database and run the initial data collection, you will often want to keep the data up-to-date over time. To do so, simply collect the data again:

$ quantrocket history collect 'japan-bank-eod'
status: the historical data will be collected asynchronously
>>> from quantrocket.history import collect_history
>>> collect_history("japan-bank-eod")
{'status': 'the historical data will be collected asynchronously'}
$ curl -X POST 'http://houston/history/queue?codes=japan-bank-eod'
{"status": "the historical data will be collected asynchronously"}

QuantRocket will bring the database current, appending new data to what you already have. The update process will run much faster than the initial data collection because it collects fewer records.

If QuantRocket detects that a split or other adjustment has occurred, it will not only collect the new data but replace the existing data for that security.

You can use the countdown service to schedule your databases to be updated regularly.
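
For example, a countdown entry like the following (the time is illustrative) would update the database each weekday evening:

# Update Japan bank EOD data each weekday at 5:30 PM
30 17 * * mon-fri quantrocket history collect 'japan-bank-eod'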

Historical data collection queue

You can queue as many historical data requests as you wish, and they will be processed in sequential order, one at a time:

$ quantrocket history collect 'aus-lrg-eod' 'singapore-15min' 'germany-1hr-bid-ask'
status: the historical data will be collected asynchronously
>>> from quantrocket.history import collect_history
>>> collect_history(["aus-lrg-eod", "singapore-15min", "germany-1hr-bid-ask"])
{'status': 'the historical data will be collected asynchronously'}
$ curl -X POST 'http://houston/history/queue?codes=aus-lrg-eod&codes=singapore-15min&codes=germany-1hr-bid-ask'
{"status": "the historical data will be collected asynchronously"}
You can view the current queue:
$ quantrocket history queue
priority: []
standard:
- aus-lrg-eod
- singapore-15min
- germany-1hr-bid-ask
>>> from quantrocket.history import get_history_queue
>>> get_history_queue()
{'priority': [],
 'standard': ['aus-lrg-eod', 'singapore-15min', 'germany-1hr-bid-ask']}
$ curl -X GET 'http://houston/history/queue'
{"priority": [], "standard": ["aus-lrg-eod", "singapore-15min", "germany-1hr-bid-ask"]}
Maybe you're regretting that the Germany request is at the end of the queue because you'd like to get that data first and start analyzing it. You can cancel the requests in front of it then add them to the end of the queue:
$ quantrocket history cancel 'aus-lrg-eod' 'singapore-15min'
priority: []
standard:
- germany-1hr-bid-ask
$ quantrocket history collect 'aus-lrg-eod' 'singapore-15min'
status: the historical data will be collected asynchronously
$ quantrocket history queue
priority: []
standard:
- germany-1hr-bid-ask
- aus-lrg-eod
- singapore-15min
>>> from quantrocket.history import get_history_queue, cancel_collections, collect_history
>>> cancel_collections(codes=["aus-lrg-eod", "singapore-15min"])
{'priority': [],
 'standard': ['germany-1hr-bid-ask']}
>>> collect_history(["aus-lrg-eod", "singapore-15min"])
{'status': 'the historical data will be collected asynchronously'}
>>> get_history_queue()
{'priority': [],
 'standard': ['germany-1hr-bid-ask', 'aus-lrg-eod', 'singapore-15min']}
$ curl -X DELETE 'http://houston/history/queue?codes=aus-lrg-eod&codes=singapore-15min'
{"priority": [], "standard": ["germany-1hr-bid-ask"]}
$ curl -X POST 'http://houston/history/queue?codes=aus-lrg-eod&codes=singapore-15min'
{"status": "the historical data will be collected asynchronously"}
$ curl -X GET 'http://houston/history/queue'
{"priority": [], "standard": ["germany-1hr-bid-ask", "aus-lrg-eod", "singapore-15min"]}

There's another way to control queue priority: QuantRocket provides a standard queue and a priority queue. The standard queue will only be processed when the priority queue is empty. This can be useful when you're trying to collect a large amount of historical data for backtesting but you don't want it to interfere with daily updates to the databases you use for trading. First, schedule your daily updates on your countdown (cron) service, using the --priority flag to route them to the priority queue:

# collect some US data each weekday at 5:30 pm
30 17 * * mon-fri quantrocket history collect --priority nyse-lrg-eod nyse-mid-eod nyse-sml-eod

Then, queue your long-running requests on the standard queue:

$ quantrocket history collect nyse-15min # many symbols + smaller granularity = slower

At 5:30pm, when several requests are queued on the priority queue, the long-running request on the standard queue will pause until the priority queue is empty again, and then resume.

Split adjustments

IB adjusts its historical data for splits, so your data will be split-adjusted when you initially retrieve it into your history database. However, if a split occurs after the initial retrieval, the data that was already stored needs to be adjusted for the split. QuantRocket handles this circumstance by comparing a recent price in the database to the equivalently-timestamped price from IB. If the prices differ, this indicates that a split has occurred or that the vendor has otherwise adjusted its data since QuantRocket stored it. Regardless of the reason, QuantRocket deletes the data for that particular security and re-collects the entire history from IB, in order to make sure the database stays synced with IB.

Dividend adjustments

By default, IB historical data is not dividend-adjusted. However, dividend-adjusted data is available from IB using the ADJUSTED_LAST bar type. This bar type has an important limitation: it is only available with a 1 day bar size.

$ quantrocket history create-db 'us-stk-1d' --universes 'us-stk' --bar-size '1 day' --bar-type 'ADJUSTED_LAST'
status: successfully created quantrocket.history.us-stk-1d.sqlite
>>> from quantrocket.history import create_db
>>> create_db("us-stk-1d", universes=["us-stk"], bar_size="1 day", bar_type="ADJUSTED_LAST")
{'status': 'successfully created quantrocket.history.us-stk-1d.sqlite'}
$ curl -X PUT 'http://houston/history/databases/us-stk-1d?universes=us-stk&bar_size=1 day&bar_type=ADJUSTED_LAST'
{"status": "successfully created quantrocket.history.us-stk-1d.sqlite"}

With ADJUSTED_LAST, QuantRocket handles dividend adjustments in the same way it handles split adjustments: whenever IB applies a dividend adjustment, QuantRocket will detect the discrepancy between the IB data and the data as stored in the history database, and will delete the stored data and re-sync with IB.

Primary vs consolidated prices

By default, IB returns consolidated prices for equities. (Consolidated prices are the aggregated prices across all exchanges where a security trades.) If you run an end-of-day strategy that enters and exits in the opening or closing auction, using consolidated prices may be less accurate than using prices from the primary exchange only. This issue is especially significant in US markets due to after-hours trading and the large number of exchanges and ECNs. (For more on this topic, see this blog post by Ernie Chan.)

You can instruct QuantRocket to collect primary exchange prices instead of consolidated prices using the --primary-exchange option. This instructs IB to filter out trades that didn't take place on the primary listing exchange for the security:

$ quantrocket history create-db 'us-stk-1d' --universes 'us-stk' --bar-size '1 day' --primary-exchange
status: successfully created quantrocket.history.us-stk-1d.sqlite
>>> from quantrocket.history import create_db
>>> create_db("us-stk-1d", universes=["us-stk"], bar_size="1 day", primary_exchange=True)
{'status': 'successfully created quantrocket.history.us-stk-1d.sqlite'}
$ curl -X PUT 'http://houston/history/databases/us-stk-1d?universes=us-stk&bar_size=1+day&primary_exchange=true'
{"status": "successfully created quantrocket.history.us-stk-1d.sqlite"}

Types of data

You can use the --bar-type parameter with create-db to indicate what type of historical data you want:

| Bar type                  | Description              | Available for                             | Notes |
| ------------------------- | ------------------------ | ----------------------------------------- | ----- |
| TRADES                    | traded price             | stocks, futures, options, forex, indexes  | adjusted for splits but not dividends |
| ADJUSTED_LAST             | traded price             | stocks                                    | adjusted for splits and dividends |
| MIDPOINT                  | bid-ask midpoint         | stocks, futures, options, forex           | the open, high, low, and closing midpoint price |
| BID                       | bid                      | stocks, futures, options, forex           | the open, high, low, and closing bid price |
| ASK                       | ask                      | stocks, futures, options, forex           | the open, high, low, and closing ask price |
| BID_ASK                   | time-average bid and ask | stocks, futures, options, forex           | time-average bid is stored in the Open field, and time-average ask is stored in the Close field; the High and Low fields contain the max ask and min bid, respectively |
| HISTORICAL_VOLATILITY     | historical volatility    | stocks, indexes                           | 30 day Garman-Klass volatility of corporate action adjusted data |
| OPTION_IMPLIED_VOLATILITY | implied volatility       | stocks, indexes                           | IB calculates implied volatility as follows: "The IB 30-day volatility is the at-market volatility estimated for a maturity thirty calendar days forward of the current trading day, and is based on option prices from two consecutive expiration months." |

If --bar-type is omitted, it defaults to MIDPOINT for forex and TRADES for everything else.

How far back historical data goes

When collecting historical data, QuantRocket first queries the IB API to determine how far back historical data is available for each security. By default, QuantRocket will collect as much data as is available. However, for large databases, such as intraday databases with many securities, it may be useful to set a fixed start date. Typically, the further back you go, the fewer securities there are with available data. Setting a fixed start date limits the size of your database (reducing initial data collection time and improving backtest speed).

Deciding how far back to collect data is made easier if you know how far back it's possible to go, and how many securities in your universe are available back to any given date. Historical data availability for select exchanges is shown here. For other exchanges or for more up-to-date availability, you can use the following approach. First, create a database with no start date:

$ quantrocket history create-db 'usa-stk-15min' --universes 'usa-stk' --bar-size '15 mins'
status: successfully created quantrocket.history.usa-stk-15min.sqlite
>>> from quantrocket.history import create_db
>>> create_db("usa-stk-15min", universes=["usa-stk"], bar_size="15 mins")
{'status': 'successfully created quantrocket.history.usa-stk-15min.sqlite'}
$ curl -X PUT 'http://houston/history/databases/usa-stk-15min?universes=usa-stk&bar_size=15+mins'
{"status": "successfully created quantrocket.history.usa-stk-15min.sqlite"}
Next, instruct QuantRocket to determine historical data availability but not yet collect the data. For large universes this might take a few hours but is much faster than actually collecting all the data:
$ quantrocket history collect 'usa-stk-15min' --availability
status: the historical data will be collected asynchronously
>>> from quantrocket.history import collect_history
>>> collect_history("usa-stk-15min", availability_only=True)
{'status': 'the historical data will be collected asynchronously'}
$ curl -X POST 'http://houston/history/queue?codes=usa-stk-15min&availability_only=true'
{"status": "the historical data will be collected asynchronously"}

Monitor flightlog, and after the data availability has been saved to your database, use the Python client to query and summarize the start dates:

>>> from quantrocket.history import get_history_availability
>>> start_dates = get_history_availability("usa-stk-15min")
>>> start_dates.head()
ConId
4027   2001-11-29 14:30:00
4050   1980-03-17 14:30:00
4065   1980-03-17 14:30:00
4151   1994-04-15 13:30:00
4157   1980-03-17 14:30:00
Name: StartDate, dtype: datetime64[ns]
>>> # Group start dates by year and plot cumulative totals
>>> cumulative_ticker_counts = start_dates.groupby(start_dates.dt.year).count().cumsum()
>>> # Exclude far future dates, which indicate data is not available
>>> cumulative_ticker_counts = cumulative_ticker_counts[cumulative_ticker_counts.index < 2100]
>>> cumulative_ticker_counts.head()
StartDate
1980    564
1981    591
1982    614
1983    662
1984    693
>>> cumulative_ticker_counts.plot(kind="bar")
When no historical data is available for a particular security, this is indicated by a far future start date of 2200-01-01.

Based on your findings, you can drop and re-create the database with a fixed start date (the historical availability records you just collected are stored in a separate database, quantrocket.history.availability.sqlite, so you won't lose any data when dropping your history database):

$ quantrocket history drop-db 'usa-stk-15min' --confirm-by-typing-db-code-again 'usa-stk-15min'
status: deleted quantrocket.history.usa-stk-15min.sqlite
$ quantrocket history create-db 'usa-stk-15min' --universes 'usa-stk' --bar-size '15 mins' --start-date '2005-01-01'
status: successfully created quantrocket.history.usa-stk-15min.sqlite
>>> from quantrocket.history import drop_db, create_db
>>> drop_db("usa-stk-15min", confirm_by_typing_db_code_again="usa-stk-15min")
{'status': 'deleted quantrocket.history.usa-stk-15min.sqlite'}
>>> create_db("usa-stk-15min", universes=["usa-stk"], bar_size="15 mins", start_date="2005-01-01")
{'status': 'successfully created quantrocket.history.usa-stk-15min.sqlite'}
$ curl -X DELETE 'http://houston/history/databases/usa-stk-15min?confirm_by_typing_db_code_again=usa-stk-15min'
{"status": "deleted quantrocket.history.usa-stk-15min.sqlite"}
$ curl -X PUT 'http://houston/history/databases/usa-stk-15min?universes=usa-stk&bar_size=15+mins&start_date=2005-01-01'
{"status": "successfully created quantrocket.history.usa-stk-15min.sqlite"}
Please note that for intraday bars, IB may not provide historical data as far back as its own reported start dates. For US stocks, no intraday data is available prior to January 2004, even if the reported start date is earlier. For Japan stocks, no intraday data is available prior to March 2004.

Working with large databases

This section is mainly relevant to working with intraday databases with large numbers of securities, for example a 1-minute database of US stocks.

Initial data collection

Depending on the bar size, number of securities, and date range of your historical database, initial data collection from the IB API can take some time. After the initial data collection, keeping your database up to date is much faster and much easier.

QuantRocket fills your historical database by making a series of requests to the IB API to get a portion of the data, from earlier data to later data. The smaller the bars, the more requests are required to collect all the data.

If you run multiple IB Gateways, each with appropriate IB market data subscriptions, QuantRocket splits the requests between the gateways, which results in a proportionate reduction in runtime.

IB API response times also vary with the monthly commissions generated on the account. Accounts generating several thousand USD or more in monthly commissions see response times roughly twice as fast as those of small accounts (or of large accounts that generate small commissions).

The following table shows estimated runtimes and database sizes for a variety of historical database configurations:

| Bar size   | Number of stocks | Years of data                | Example universes               | Runtime (high commission account, 4 IB Gateways) | Runtime (standard account, 2 IB Gateways) | Database size |
| ---------- | ---------------- | ---------------------------- | ------------------------------- | ------------------------------------------------ | ----------------------------------------- | ------------- |
| 1 day      | 6,000            | all available (1980-present) | US listed stocks                | 3 hours                                          | 12 hours                                  | 2.5 GB |
| 15 minutes | 6,000            | all available (2004-present) | US listed stocks                | 3 days                                           | 2 weeks                                   | 100 GB |
| 1 minute   | 3,000            | 5 years                      | one of: NYSE, NASDAQ, TSEJ, LSE | 1 week                                           | 1 month                                   | 300 GB |
| 1 minute   | 6,000            | 5 years                      | US listed stocks                | 2 weeks                                          | 2 months                                  | 600 GB |
| 1 minute   | 6,000            | all available (2004-present) | US listed stocks                | 1 month                                          | 4 months                                  | 1.2 TB |

You can use the table above to infer the collection times for other bar sizes and universe sizes. See the exchanges table on the account page for the approximate number of listings for each exchange.
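
For example, under the rough assumption that runtime and database size scale linearly with the number of bars per day and the number of securities, you could estimate a hypothetical 5-minute database of 1,500 US stocks from the 15-minute row of the table above (illustrative arithmetic only; actual runtimes vary):

>>> # baseline from the table: 15-minute bars, 6,000 stocks, standard account with 2 IB Gateways
>>> baseline_runtime_weeks = 2
>>> baseline_size_gb = 100
>>> # 5-minute bars produce 3x as many bars as 15-minute bars,
>>> # and 1,500 stocks is 1/4 of 6,000 stocks
>>> bar_factor = 15 / 5
>>> universe_factor = 1500 / 6000
>>> baseline_runtime_weeks * bar_factor * universe_factor  # estimated runtime in weeks
1.5
>>> baseline_size_gb * bar_factor * universe_factor  # estimated size in GB
75.0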

Data collection strategies

Below are several data collection strategies that may help speed up data collection, reduce the amount of data you need to collect, or allow you to begin working with a subset of data while collecting the full amount of data.

Run multiple IB Gateways

You can cut down initial data collection time by running multiple IB gateways. See the section on obtaining and using multiple IB logins.

Daily bars before intraday bars

Suppose you want to collect intraday bars for the top 1000 liquid securities trading on NYSE and NASDAQ. Instead of collecting intraday bars for all NYSE and NASDAQ securities then filtering out illiquid ones, you could try this approach:

  • collect a year's worth of daily bars for all NYSE and NASDAQ securities (this requires only 1 request to the IB API per security and will run much faster than collecting multiple years of intraday bars)
  • in a notebook, query the daily bars and use them to calculate dollar volume, then create a universe of liquid securities only (see usage guide section on using price data to define universes)
  • collect intraday bars for the universe of liquid securities only

You can periodically repeat this process to update the universe constituents.
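
A minimal Python sketch of this workflow follows. The database and universe codes are hypothetical, it assumes you have already created and collected a daily history database covering all NYSE and NASDAQ securities, and it assumes create_universe accepts a CSV containing a ConId column:

>>> import io
>>> import pandas as pd
>>> from quantrocket.history import get_historical_prices
>>> from quantrocket.master import create_universe
>>> # query the daily bars collected for the full NYSE/NASDAQ universe
>>> prices = get_historical_prices("nyse-nasdaq-1d", fields=["Close", "Volume"])
>>> closes = prices.loc["Close"]
>>> volumes = prices.loc["Volume"]
>>> # rank securities by average daily dollar volume and keep the top 1,000
>>> avg_dollar_volumes = (closes * volumes).mean()
>>> top_conids = avg_dollar_volumes.sort_values(ascending=False).head(1000).index
>>> # create a universe of the liquid securities, then collect intraday bars for that universe
>>> f = io.StringIO()
>>> pd.DataFrame({"ConId": top_conids}).to_csv(f, index=False)
>>> f.seek(0)
>>> create_universe("nyse-nasdaq-liquid", infilepath_or_buffer=f)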

Filter by availability of fundamentals

Suppose you have a strategy that requires intraday bars and fundamental data and utilizes a universe of small-cap stocks. For many small-cap stocks, fundamental data won't be available, so it doesn't make sense to spend time collecting intraday historical data for stocks that won't have fundamental data. Instead, collect the fundamental data first, filter your universe to stocks with fundamentals, then collect the historical intraday data. For example (a Python sketch of this workflow follows the list):

  • create a universe of all Japanese small-cap stocks called 'japan-sml'
  • collect fundamentals for the universe 'japan-sml'
  • in a notebook, query the fundamentals for 'japan-sml' and use the query results to create a new universe called 'japan-sml-with-fundamentals'
  • collect intraday price history for 'japan-sml-with-fundamentals'
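
Below is a minimal Python sketch of the middle steps, under the assumption that querying a common metric such as total assets (ATOT) is a reasonable proxy for fundamental data availability, and that create_universe accepts a CSV containing a ConId column:

>>> import io
>>> import pandas as pd
>>> from quantrocket.fundamental import download_reuters_financials
>>> from quantrocket.master import create_universe
>>> # after fundamentals have been collected for 'japan-sml', query a common metric
>>> # to see which securities actually have fundamental data
>>> f = io.StringIO()
>>> download_reuters_financials(["ATOT"], f, universes=["japan-sml"])
>>> financials = pd.read_csv(f)
>>> conids_with_fundamentals = financials.ConId.unique()
>>> # create a new universe containing only those securities
>>> f = io.StringIO()
>>> pd.DataFrame({"ConId": conids_with_fundamentals}).to_csv(f, index=False)
>>> f.seek(0)
>>> create_universe("japan-sml-with-fundamentals", infilepath_or_buffer=f)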

Earlier history before later history

Suppose you want to collect numerous years of intraday bars. But you'd like to test your ideas on a smaller date range first in order to decide if collecting the full history is worthwhile. This can be done as follows. First, define your desired start date when you create the database:

$ quantrocket history create-db 'usa-liquid-15min' -u 'usa-liquid' -z '15 mins' -s '2011-01-01'

The above database is designed to collect data back to 2011-01-01 and up to the present. However, you can temporarily specify an end date when collecting the data:

$ quantrocket history collect 'usa-liquid-15min' -e '2012-01-01'

In this example, only a year of data will be collected (that is, from the start date of 2011-01-01 specified when the database was created to the end date of 2012-01-01 specified in the above command). That way you can start your research sooner. Later, you can repeat this command with a later end date or remove the end date entirely to bring the database current.

In contrast, it's a bad idea to use a temporary start date to shorten the date range and speed up the data collection, with the intention of going back later to get the earlier data. Since data is filled from back to front (that is, from older dates to newer), once you've collected a later portion of data for a given security, you can't append an earlier portion of data without starting over.

Database per decade

Data for some securities goes back 30 years or more. After testing on recent data, you might want to explore earlier years. While you can't append earlier data to an existing database, you can collect the earlier data in a completely separate database. Depending on your bar size and universe size, you might create a separate database for each decade. These databases would be for backtesting only and, after the initial data collection, would not need to be updated. Only your database of the most recent decade would need to be updated.
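
For example, you might set up decade-specific databases like this (a sketch only: the database codes are hypothetical, US intraday data only goes back to 2004, and it assumes collect_history accepts an end_date parameter mirroring the CLI's -e option):

>>> from quantrocket.history import create_db, collect_history
>>> # a backtesting-only database covering 2004-2009, collected once and not updated again
>>> create_db("usa-stk-1min-2000s", universes=["usa-stk"], bar_size="1 min", start_date="2004-01-01")
>>> collect_history("usa-stk-1min-2000s", end_date="2010-01-01")
>>> # the current decade's database, which is kept up to date
>>> create_db("usa-stk-1min-2010s", universes=["usa-stk"], bar_size="1 min", start_date="2010-01-01")
>>> collect_history("usa-stk-1min-2010s")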

Small universes before large universes

Another option to get you researching and backtesting sooner is to collect a subset of your target universe before collecting the entire universe. For example, instead of collecting intraday bars for 1000 securities, collect bars for 100 securities and start testing with those while collecting the remaining data.

Don't collect what you don't need

Many of the strategies outlined above can be summarized in one principle: try to keep your databases as small as possible. Small databases are faster to fill initially, take up less disk space, and, most importantly, are faster and easier to work with in research, backtesting, and trading. If you need a large universe of minute bars, by all means collect it, but in light of the runtime and performance costs of working with large amounts of data, it pays to analyze your data requirements in advance and exclude any data you know you won't need.

Database sharding

In database design, "sharding" refers to dividing a large database into multiple smaller databases, with each smaller database or "shard" containing a subset of the total database rows. A collection of database shards typically performs better than a single large database. When a query is run, the rows from each shard are combined into a single result set as if they came from a single database.

Very large databases are too large to load entirely into memory, and sharding doesn't circumvent this. Rather, the purpose of sharding is to allow you to efficiently query the particular subset of data you're interested in at the moment.

For intraday databases with more than 100 securities, QuantRocket's default behavior is to create a sharded database. By default, QuantRocket stores two separate sharded copies of the data. One copy is sharded by time: a separate database shard for each time of day. The other copy is sharded by conid (security): a separate database shard for each security. For example, if you create an intraday database of 15-minute bars called 'usa-stk-15min', you'll see a separate database for 09:30:00 bars, 09:45:00 bars, etc. (with each separate database containing all dates and all securities for only that bar time), and you'll also see a separate database for each security (with each separate database containing all dates and all bar times for only that security):

$ quantrocket db list 'history' 'usa-stk-15min' --expand --detail
- last_modified: '2018-09-05T13:53:44'
  name: quantrocket.history.usa-stk-15min.093000.sqlite
  path: /var/lib/quantrocket/quantrocket.history.usa-stk-15min.sqlite/time/quantrocket.history.usa-stk-15min.093000.sqlite
  size_in_mb: 103.59
- last_modified: '2018-09-05T13:53:44'
  name: quantrocket.history.usa-stk-15min.094500.sqlite
  path: /var/lib/quantrocket/quantrocket.history.usa-stk-15min.sqlite/time/quantrocket.history.usa-stk-15min.094500.sqlite
  size_in_mb: 103.85
...
- last_modified: '2018-09-05T12:43:47'
  name: quantrocket.history.usa-stk-15min.265598.sqlite
  path: /var/lib/quantrocket/quantrocket.history.usa-stk-15min.sqlite/conid/quantrocket.history.usa-stk-15min.265598.sqlite
  size_in_mb: 75.29

You can query the data as normal, and it will be returned as a single result set. QuantRocket will look in whichever copy of the database allows for the most efficient query based on your query parameters, that is, whichever copy allows looking in the fewest number of shards. For example, if you query prices at a few times of day for many securities, QuantRocket will use the time-sharded database to satisfy your request; if you query prices for many times of day for a few securities, QuantRocket will use the conid-sharded database to satisfy your request:

>>> # this query will look in 3 time shards:
>>> #  - quantrocket.history.usa-stk-15min.094500.sqlite
>>> #  - quantrocket.history.usa-stk-15min.120000.sqlite
>>> #  - quantrocket.history.usa-stk-15min.154500.sqlite
>>> prices = get_historical_prices("usa-stk-15min", times=["09:30:00", "12:00:00", "15:45:00"])
>>> # this query will look in 2 conid shards:
>>> #  - quantrocket.history.usa-stk-15min.265598.sqlite
>>> #  - quantrocket.history.usa-stk-15min.4075.sqlite
>>> prices = get_historical_prices("usa-stk-15min", conids=[265598, 4075])

Sharding by time and by security allows for the most flexible querying but requires double the disk space. If you want to control how (or whether) your database is sharded, you can do so at the time you create it:

$ # shard by conid only
$ quantrocket history create-db 'usa-stk-15min' --universes 'usa-stk' --bar-size '15 mins' --shard 'conid'
status: successfully created quantrocket.history.usa-stk-15min.sqlite
>>> # shard by conid only
>>> from quantrocket.history import create_db
>>> create_db("usa-stk-15min", universes=["usa-stk"], bar_size="15 mins", shard="conid")
{'status': 'successfully created quantrocket.history.usa-stk-15min.sqlite'}
$ # shard by conid only
$ curl -X PUT 'http://houston/history/databases/usa-stk-15min?universes=usa-stk&bar_size=15+mins&shard=conid'
{"status": "successfully created quantrocket.history.usa-stk-15min.sqlite"}

Time filters for intraday databases

When creating a historical database of intraday bars, you can use the times or between-times options to filter out unwanted bars.

For example, it's usually a good practice to explicitly specify the session start and end times, as the IB API sometimes sends a small number of bars from outside regular trading hours, and any trading activity from these bars will be included in the cumulative daily totals calculated by QuantRocket. The following command instructs QuantRocket to keep only those bars that fall between 9:30 and 15:45, inclusive. (Note that bar times correspond to the start of the bar, so the final bar for US stocks using 15-min bars would be 15:45:00.)

$ quantrocket history create-db 'nasdaq-stk-15min' --universes 'nasdaq-stk' --bar-size '15 mins' --between-times '09:30:00' '15:45:00'
status: successfully created quantrocket.history.nasdaq-stk-15min.sqlite
>>> from quantrocket.history import create_db
>>> create_db("nasdaq-stk-15min", universes=["nasdaq-stk"], bar_size="15 mins", between_times=["09:30:00", "15:45:00"])
{'status': 'successfully created quantrocket.history.nasdaq-stk-15min.sqlite'}
$ curl -X PUT 'http://houston/history/databases/nasdaq-stk-15min?universes=nasdaq-stk&bar_size=15+mins&between_times=09%3A30%3A00&between_times=15%3A45%3A00'
{"status": "successfully created quantrocket.history.nasdaq-stk-15min.sqlite"}
You can view the database config to see how QuantRocket expanded the between-times values into an explicit list of times to keep:
$ quantrocket history config "nasdaq-stk-15min"
bar_size: 15 mins
times:
- 09:30:00
- 09:45:00
- '10:00:00'
- '10:15:00'
- '10:30:00'
- '10:45:00'
- '11:00:00'
- '11:15:00'
- '11:30:00'
- '11:45:00'
- '12:00:00'
- '12:15:00'
- '12:30:00'
- '12:45:00'
- '13:00:00'
- '13:15:00'
- '13:30:00'
- '13:45:00'
- '14:00:00'
- '14:15:00'
- '14:30:00'
- '14:45:00'
- '15:00:00'
- '15:15:00'
- '15:30:00'
- '15:45:00'
universes:
- nasdaq-stk
vendor: ib
>>> from quantrocket.history import get_db_config
>>> get_db_config("nasdaq-stk-15min")
 {'bar_size': '15 mins',
  'times': ['09:30:00',
   '09:45:00',
   '10:00:00',
   '10:15:00',
   '10:30:00',
   '10:45:00',
   '11:00:00',
   '11:15:00',
   '11:30:00',
   '11:45:00',
   '12:00:00',
   '12:15:00',
   '12:30:00',
   '12:45:00',
   '13:00:00',
   '13:15:00',
   '13:30:00',
   '13:45:00',
   '14:00:00',
   '14:15:00',
   '14:30:00',
   '14:45:00',
   '15:00:00',
   '15:15:00',
   '15:30:00',
   '15:45:00'],
  'universes': ['nasdaq-stk'],
  'vendor': 'ib'}
$ curl 'http://houston/history/databases/nasdaq-stk-15min'
{"universes": ["nasdaq-stk"], "bar_size": "15 mins", "vendor": "ib", "times": ["09:30:00", "09:45:00", "10:00:00", "10:15:00", "10:30:00", "10:45:00", "11:00:00", "11:15:00", "11:30:00", "11:45:00", "12:00:00", "12:15:00", "12:30:00", "12:45:00", "13:00:00", "13:15:00", "13:30:00", "13:45:00", "14:00:00", "14:15:00", "14:30:00", "14:45:00", "15:00:00", "15:15:00", "15:30:00", "15:45:00"]}
More selectively, if you know you only care about particular times, you can keep only those times, which will result in a smaller, faster database:
$ quantrocket history create-db 'nasdaq-stk-15min' --universes 'nasdaq-stk' --bar-size '15 mins' --times '09:30:00' '09:45:00' '10:00:00' '15:45:00'
status: successfully created quantrocket.history.nasdaq-stk-15min.sqlite
>>> from quantrocket.history import create_db
>>> create_db("nasdaq-stk-15min", universes=["nasdaq-stk"], bar_size="15 mins", times=["09:30:00", "09:45:00", "10:00:00", "15:45:00"])
{'status': 'successfully created quantrocket.history.nasdaq-stk-15min.sqlite'}
$ curl -X PUT 'http://houston/history/databases/nasdaq-stk-15min?universes=nasdaq-stk&bar_size=15+mins&times=09%3A30%3A00&times=09%3A45%3A00&times=10%3A00%3A00&times=15%3A45%3A00'
{"status": "successfully created quantrocket.history.nasdaq-stk-15min.sqlite"}

The downside of keeping only a few times is that you'll have to collect data again if you later decide you want to analyze prices at other times of the session. An alternative is to save all the times but filter by time when querying the data, as described below.

Load only what you need

The more data you load into Pandas, the slower the performance will be. Therefore, it's a good idea to filter the dataset before loading it, particularly when working with large universes and intraday bars. Use the fields, times, start_date, and end_date parameters to load only the data you need:

>>> prices = get_historical_prices("usa-stk-15min", start_date="2017-01-01", fields=["Open","Close"], times=["09:30:00", "15:45:00"])

Historical data analysis in Python

Daily historical data

Using the Python client, you can load historical data into a Pandas DataFrame:

>>> from quantrocket.history import get_historical_prices
>>> prices = get_historical_prices("japan-bank-eod", start_date="2017-01-01", fields=["Open","High","Low","Close", "Volume"])

The DataFrame will have a column for each security (represented by conids). For daily bar sizes and larger, the DataFrame will have a two-level index: an outer level for each field (Open, Close, Volume, etc.) and an inner level containing a DatetimeIndex:

>>> prices.head()
ConId              13857203   13905344   13905462   13905522   13905624   \
Field Date
Close 2017-01-04    11150.0     3853.0     4889.0     4321.0     2712.0
      2017-01-05    11065.0     3910.0     4927.0     4299.0     2681.0
      2017-01-06    11105.0     3918.0     4965.0     4266.0     2672.5
      2017-01-10    11210.0     3886.0     4965.0     4227.0     2640.0
      2017-01-11    11115.0     3860.0     4970.0     4208.0     2652.0
...
Volume 2018-01-29   685800.0  2996700.0  1000600.0  1339000.0  6499600.0
       2018-01-30   641700.0  2686100.0  1421900.0  1709900.0  7039800.0
       2018-01-31   603400.0  3179000.0  1517100.0  1471000.0  5855500.0
       2018-02-01   447300.0  3300900.0  1295800.0  1329600.0  5540600.0
       2018-02-02   510200.0  4739800.0  2060500.0  1145200.0  5585300.0

The DataFrame can be thought of as several stacked DataFrames, one for each field. You can use .loc to isolate a DataFrame for each field:

>>> closes = prices.loc["Close"]
>>> closes.head()
ConId        13857203   13905344   13905462   13905522   13905624   13905665   \
Date
2017-01-04    11150.0     3853.0     4889.0     4321.0     2712.0      655.9
2017-01-05    11065.0     3910.0     4927.0     4299.0     2681.0      658.4
2017-01-06    11105.0     3918.0     4965.0     4266.0     2672.5      656.2
2017-01-10    11210.0     3886.0     4965.0     4227.0     2640.0      652.8
2017-01-11    11115.0     3860.0     4970.0     4208.0     2652.0      665.1

Each field's DataFrame has the same columns and index, which makes it easy to perform matrix operations. For example, calculate dollar volume (or Euro volume, Yen volume, etc. depending on the universe):

>>> volumes = prices.loc["Volume"]
>>> dollar_volumes = closes * volumes

Or calculate overnight (close-to-open) returns:

>>> opens = prices.loc["Open"]
>>> prior_closes = closes.shift()
>>> overnight_returns = (opens - prior_closes) / prior_closes
>>> overnight_returns.head()
ConId        13857203   13905344   13905462   13905522   13905624   13905665   \
Date
2017-01-04        NaN        NaN        NaN        NaN        NaN        NaN
2017-01-05   0.001345   0.004412   0.003477  -0.002083   0.002765   0.021497
2017-01-06  -0.000904  -0.005115  -0.000812  -0.011165  -0.016039  -0.012606
2017-01-10  -0.003152  -0.006891   0.009869  -0.008204  -0.011038  -0.002591
2017-01-11   0.000446  -0.000257   0.007049   0.004968   0.001894   0.009498

Intraday historical data

In contrast to daily bars, the stacked DataFrame for intraday bars has a three-level index, consisting of the field, the date, and the time as a string (for example, 09:30:00):

>>> prices = get_historical_prices("etf-1h", start_date="2017-01-01", fields=["Open","High","Low","Close", "Volume"])
>>> prices.head()
ConId                         756733  72195411  73128548
Field Date        Time
Close 2017-07-20  09:30:00    247.28    324.30    216.27
                  10:00:00    247.08    323.94    216.25
                  11:00:00    246.97    323.63    215.90
                  12:00:00    247.25    324.11    216.22
                  13:00:00    247.29    324.32    216.22
...
Volume 2017-08-04 11:00:00   5896400.0  168700.0  170900.0
                  12:00:00   2243700.0  237300.0  114100.0
                  13:00:00   2228000.0  113900.0  107600.0
                  14:00:00   2841400.0   84500.0  116700.0
                  15:00:00  11351600.0  334000.0  357000.0

As with daily bars, use .loc to isolate a particular field.

>>> closes = prices.loc["Close"]
>>> closes.head()
ConId                  756733  72195411  73128548
Date       Time
2017-07-20 09:30:00    247.28    324.30    216.27
           10:00:00    247.08    323.94    216.25
           11:00:00    246.97    323.63    215.90
           12:00:00    247.25    324.11    216.22
           13:00:00    247.29    324.32    216.22

To isolate a particular time, use Pandas' .xs method (short for "cross-section"):

>>> session_closes = closes.xs("15:45:00", level="Time")
>>> session_closes.head()
ConId         756733  72195411  73128548
Date
2017-07-20    247.07    323.84    216.16
2017-07-21    246.89    322.93    215.53
2017-07-24    246.81    323.50    215.09
2017-07-25    247.39    326.37    215.88
2017-07-26    247.45    323.36    216.81
A bar's time represents the start of the bar. Thus, to get the 4:00 PM closing price using 15-minute bars, you would look at the close of the "15:45:00" bar. To get the 3:45 PM price using 15-minute bars, you could look at the open of the "15:45:00" bar or the close of the "15:30:00" bar.

After taking a cross-section of an intraday DataFrame, you can perform matrix operations with bars from different times of day:

>>> opens = prices.loc["Open"]
>>> session_opens = opens.xs("09:30:00", level="Time")
>>> session_closes = closes.xs("15:45:00", level="Time")
>>> prior_session_closes = session_closes.shift()
>>> overnight_returns = (session_opens - prior_session_closes) / prior_session_closes
>>> overnight_returns.head()
ConId       756733    72195411  73128548
Date
2017-07-20       NaN       NaN       NaN
2017-07-21 -0.002509 -0.001637 -0.004441
2017-07-24 -0.000405 -0.000929 -0.000139
2017-07-25  0.003525  0.005286  0.006555
2017-07-26  0.001455  0.000123  0.004308

Timezone of intraday data

Intraday historical data is stored in the database in ISO-8601 format, which consists of the date followed by the time in the local timezone of the exchange, followed by a UTC offset. For example, a 9:30 AM bar for a stock trading on the NYSE might have a timestamp of 2017-07-25T09:30:00-04:00, where -04:00 indicates that New York is 4 hours behind Greenwich Mean Time/UTC. This storage format allows QuantRocket to properly align data that may originate from different timezones.
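
For example, because the UTC offset is part of the stored timestamp, the same instant can be converted unambiguously to any other timezone (an illustrative pandas snippet, not a QuantRocket function):

>>> import pandas as pd
>>> ts = pd.Timestamp("2017-07-25T09:30:00-04:00")
>>> # 9:30 AM in New York and 10:30 PM in Tokyo are the same instant
>>> tokyo_ts = ts.tz_convert("Asia/Tokyo")  # 2017-07-25 22:30:00+09:00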

If you don't specify the timezone parameter when loading prices into Pandas using get_historical_prices, the function will infer the timezone from the data itself. (This is accomplished by querying the securities master database to determine the timezone of the securities in your dataset.) This approach works fine as long as your data originates from a single timezone. If multiple timezones are represented, an error will be raised.

>>> prices = get_historical_prices("aapl-arb-5min")
ParameterError: cannot infer timezone because multiple timezones are present in data, please specify timezone explicitly (timezones: America/New_York, America/Mexico_City)

In this case, you should manually specify the timezone to which you want the data to be aligned:

>>> prices = get_historical_prices("aapl-arb-5min", timezone="America/New_York")

Historical data with a bar size of 1 day or higher is stored and returned in YYYY-MM-DD format. Specifying a timezone for such a database has no effect.

Append securities master fields

Sometimes it might be useful to use securities master fields such as the primary exchange in your data analysis. To do so, first request the desired master_fields:

>>> prices = get_historical_prices("usa-1d", fields=["Close","Open"], master_fields=["PrimaryExchange"], start_date="2018-01-01")

You can isolate the securities master fields using .loc, like any other field. The securities master values are indexed to the earliest date of your historical data:

>>> exchanges = prices.loc["PrimaryExchange"]
>>> exchanges.head()
ConId           4027      4157      4205 309001160 309221203
Date
2018-01-02      NYSE    NASDAQ      NYSE      AMEX    NASDAQ

You can easily forward-fill the values to match the shape of your price data:

>>> closes = prices.loc["Close"]
>>> exchanges = exchanges.reindex(closes.index, method="ffill")
>>> closes.where(exchanges=="NYSE").head()
ConId           4027      4157      4205 309001160 309221203
Date
2018-01-02    106.09       NaN     46.73       NaN       NaN
2018-01-03    107.05       NaN     46.46       NaN       NaN
2018-01-04       111       NaN     46.86       NaN       NaN
2018-01-05    112.18       NaN     47.02       NaN       NaN
2018-01-08    111.39       NaN     47.13       NaN       NaN

Cumulative daily prices for intraday data

For historical databases with bar sizes smaller than 1 day, QuantRocket will calculate and store the day's high, low, and volume as of each intraday bar. When querying intraday data, the additional fields DayHigh, DayLow, and DayVolume are available. Other fields represent only the trading activity that occurred within the duration of a particular bar: for example, the Volume field for a 15:00:00 bar in a database with 1-hour bars represents the trading volume from 15:00:00 to 16:00:00. In contrast, DayHigh, DayLow, and DayVolume represent the trading activity for the entire day up to and including the particular bar.

>>> prices = get_historical_prices(
              "spy-1h",
              fields=["Open","High","Low","Close","Volume","DayHigh","DayLow","DayVolume"])
>>> # Below, the volume from 15:00 to 16:00 is 16.9M shares, while the day's total
>>> # volume through 16:00 (the end of the bar) is 48M shares. The low between
>>> # 15:00 and 16:00 is 272.97, while the day's low is 272.42.
>>> prices.xs("2018-03-08", level="Date").xs("15:00:00", level="Time")
ConId           756733
Field
Close           274.09
DayHigh         274.24
DayLow          272.42
DayVolume  48126000.00
High            274.24
Low             272.97
Open            273.66
Volume     16897100.00

A common use case for cumulative daily totals is if your research idea or trading strategy needs a selection of intraday prices but also needs access to daily price fields (e.g. to calculate average daily volume). Instead of requesting and aggregating all intraday bars (which for large universes might require loading too much data), you can use the times parameter to load only the intraday bars you need, including the final bar of the trading session to give you access to the daily totals. For example, here is how you might screen for stocks with heavy volume in the opening 30 minutes relative to their average volume:

>>> # load the 9:45-10:00 bar and the 15:45-16:00 bar
>>> prices = get_historical_prices("usa-stk-15min", start_date="2018-01-01", times=["09:45:00","15:45:00"], fields=["DayVolume"])
>>> # the 09:45:00 bar contains the cumulative volume through the end of the bar (10:00:00)
>>> early_session_volumes = prices.loc["DayVolume"].xs("09:45:00", level="Time")
>>> # the 15:45:00 bar contains the cumulative volume for the entire day
>>> daily_volumes = prices.loc["DayVolume"].xs("15:45:00", level="Time")
>>> avg_daily_volumes = daily_volumes.rolling(window=30).mean()
>>> # look for early volume that is more than twice the average daily volume
>>> volume_surges = early_session_volumes > (avg_daily_volumes.shift() * 2)
Cumulative daily totals are calculated directly from the intraday data in your database and thus will reflect any times or between-times filters used when creating the database.

Continuous Futures

You can use QuantRocket to query futures as continuous contracts. QuantRocket collects and stores data for each individual futures expiry, but can optionally stitch the data into a continuous contract at query time.

Suppose we've created a universe of all expiries of KOSPI 200 futures, trading on the Korea Stock Exchange:

$ quantrocket master listings --exchange 'KSE' --sec-types 'FUT' --symbols 'K200'
status: the listing details will be collected asynchronously
$ # wait for listings to be collected, then:
$ quantrocket master get -e 'KSE' -t 'FUT' -s 'K200' | quantrocket master universe 'k200' -f '-'
code: k200
inserted: 15
provided: 15
total_after_insert: 15
>>> from quantrocket.master import collect_listings, create_universe, download_master_file
>>> import io
>>> collect_listings(exchange="KSE", sec_types=["FUT"], symbols=["K200"])
{'status': 'the listing details will be collected asynchronously'}
>>> # wait for listings to be collected, then:
>>> f = io.StringIO()
>>> download_master_file(f, exchanges=["KSE"], sec_types=["FUT"], symbols=["K200"])
>>> create_universe("k200", infilepath_or_buffer=f)
{'code': 'k200', 'inserted': 15, 'provided': 15, 'total_after_insert': 15}
$ curl -X POST 'http://houston/master/listings?exchange=KSE&sec_types=FUT&symbols=K200'
{"status": "the listing details will be collected asynchronously"}
$ # wait for listings to be collected, then:
$ curl -X GET 'http://houston/master/securities.csv?exchanges=KSE&sec_types=FUT&symbols=K200' > k200.csv
$ curl -X PUT 'http://houston/master/universes/k200' --upload-file k200.csv
{"code": "k200", "provided": 15, "inserted": 15, "total_after_insert": 15}
We can create a history database and collect historical data for each expiry:
$ quantrocket history create-db 'k200-1h' --universes 'k200' --bar-size '1 hour'
status: successfully created quantrocket.history.k200-1h.sqlite
$ quantrocket history collect 'k200-1h'
status: the historical data will be collected asynchronously
>>> from quantrocket.history import create_db, collect_history
>>> create_db("k200-1h", universes=["k200"], bar_size="1 hour")
{'status': 'successfully created quantrocket.history.k200-1h.sqlite'}
>>> collect_history("k200-1h")
{'status': 'the historical data will be collected asynchronously'}
$ curl -X PUT 'http://houston/history/databases/k200-1h?universes=k200&bar_size=1+hour'
{"status": "successfully created quantrocket.history.k200-1h.sqlite"}
$ curl -X POST 'http://houston/history/queue?codes=k200-1h'
{"status": "the historical data will be collected asynchronously"}
The historical prices for each futures expiry are stored separately and, by default, are returned separately at query time. However, you can optionally tell QuantRocket to stitch the contracts together when you query. The fastest stitching method is simple concatenation:
$ quantrocket history get 'k200-1h' --fields 'Open' 'Close' 'Volume' --outfile 'k200_1h.csv' --cont-fut 'concat'
>>> from quantrocket.history import download_history_file
>>> download_history_file("k200-1h", filepath_or_buffer="k200_1h.csv", fields=["Open","Close", "Volume"], cont_fut="concat")
$ curl -X GET 'http://houston/history/k200-1h.csv?fields=Open&fields=Close&fields=Volume&cont_fut=concat' > k200_1h.csv

The contracts will be stitched together according to the rollover dates as configured in the master service, and the continuous contract will be returned under the conid of the current front-month contract.

A history database need not contain only futures in order to use the continuous futures query option. The option will be ignored for any non-futures, which will be returned as stored. Any futures in the database will be grouped together by symbol, exchange, currency, and multiplier in order to create the continuous contracts. The continuous contracts will be returned alongside the non-futures.
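
To work with the stitched data in Python, you can load the downloaded file with pandas (a minimal sketch; it assumes the standard history CSV layout with ConId and Date columns):

>>> import pandas as pd
>>> k200 = pd.read_csv("k200_1h.csv", parse_dates=["Date"])
>>> # one row per bar, keyed to the conid of the current front-month contract
>>> k200 = k200.set_index(["ConId", "Date"])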

Fundamental Data

This section is about collecting and analyzing fundamental data. For an introductory overview of the available data, see the Reuters Fundamentals data overview.

Reuters estimates and actuals

Collect Reuters estimates

To use Reuters estimates and actuals in QuantRocket, first collect the data from IB into your QuantRocket database. Then you can run queries against the database in your research and backtests.

To collect analyst estimates and actuals, specify one or more conids or universes to collect data for:

$ quantrocket fundamental collect-estimates --universes 'japan-banks' 'singapore-banks'
status: the fundamental data will be collected asynchronously
>>> from quantrocket.fundamental import collect_reuters_estimates
>>> collect_reuters_estimates(universes=["japan-banks","singapore-banks"])
{'status': 'the fundamental data will be collected asynchronously'}
$ curl -X POST 'http://houston/fundamental/reuters/estimates?universes=japan-banks&universes=singapore-banks'
{"status": "the fundamental data will be collected asynchronously"}

Multiple requests will be queued and processed sequentially. You can monitor flightlog via the command line or Papertrail to track progress:

$ quantrocket flightlog stream
2017-11-23 14:13:22 quantrocket.fundamental: INFO Collecting Reuters estimates from IB for universes japan-banks, singapore-banks
2017-11-23 14:15:35 quantrocket.fundamental: INFO Expected remaining runtime to collect Reuters estimates for universes japan-banks, singapore-banks: 0:04:25
2017-11-23 14:24:01 quantrocket.fundamental: INFO Saved 3298 total records for 60 total securities to quantrocket.fundamental.reuters.estimates.sqlite for universes japan-banks, singapore-banks

Query Reuters estimates

To query Reuters estimates and actuals, first look up the code(s) for the metrics you care about:

$ quantrocket fundamental codes --report-types 'estimates'
estimates:
  BVPS: Book Value Per Share
  CAPEX: Capital Expenditure
  CPS: Cash Flow Per Share
  DPS: Dividend Per Share
  EBIT: Earnings Before Interest and Tax
...
>>> from quantrocket.fundamental import list_reuters_codes
>>> list_reuters_codes(report_types=["estimates"])
{'estimates': {'BVPS': 'Book Value Per Share',
  'CAPEX': 'Capital Expenditure',
  'CPS': 'Cash Flow Per Share',
  'DPS': 'Dividend Per Share',
  'EBIT': 'Earnings Before Interest and Tax',
...
}}
$ curl -X GET 'http://houston/fundamental/reuters/codes?report_types=estimates'
{"estimates": {"BVPS": "Book Value Per Share", "CAPEX": "Capital Expenditure", "CPS": "Cash Flow Per Share", "DPS": "Dividend Per Share", "EBIT": "Earnings Before Interest and Tax",...}}

Let's query EPS estimates and actuals:

$ quantrocket fundamental estimates 'EPS' -u 'us-banks' -s '2014-01-01' -e '2017-01-01' -o eps_estimates.csv
$ csvlook -I --max-columns 10 --max-rows 5 eps_estimates.csv
| ConId | Indicator | Unit | FiscalYear | FiscalPeriodEndDate | FiscalPeriodType | FiscalPeriodNumber | High | Low  | Mean   | ... |
| ----- | --------- | ---- | ---------- | ------------------- | ---------------- | ------------------ | ---- | ---- | ------ | --- |
| 9029  | EPS       | U    | 2014       | 2014-03-31          | Q                | 1                  | 0.31 | 0.2  | 0.255  | ... |
| 9029  | EPS       | U    | 2014       | 2014-06-30          | Q                | 2                  | 0.77 | 0.73 | 0.7467 | ... |
| 9029  | EPS       | U    | 2014       | 2014-09-30          | Q                | 3                  | 0.71 | 0.63 | 0.6667 | ... |
| 9029  | EPS       | U    | 2014       | 2014-12-31          | A                |                    | 2.25 | 2.23 | 2.2433 | ... |
| 9029  | EPS       | U    | 2014       | 2014-12-31          | Q                | 4                  | 0.49 | 0.47 | 0.4833 | ... |
| ...   | ...       | ...  | ...        | ...                 | ...              | ...                | ...  | ...  | ...    | ... |
>>> from quantrocket.fundamental import download_reuters_estimates
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> download_reuters_estimates(["EPS"],f,universes=["us-banks"],
                            start_date="2014-01-01", end_date="2017-01-01")
>>> eps_estimates = pd.read_csv(f, parse_dates=["FiscalPeriodEndDate", "AnnounceDate"])
>>> eps_estimates.head()
    ConId Indicator Unit  FiscalYear FiscalPeriodEndDate FiscalPeriodType  \
0   9029       EPS    U        2014          2014-03-31                Q
1   9029       EPS    U        2014          2014-06-30                Q
2   9029       EPS    U        2014          2014-09-30                Q
3   9029       EPS    U        2014          2014-12-31                A
4   9029       EPS    U        2014          2014-12-31                Q

   FiscalPeriodNumber  High   Low    Mean  Median  StdDev  NumOfEst  \
0                 1.0  0.31  0.20  0.2550   0.255  0.0550       2.0
1                 2.0  0.77  0.73  0.7467   0.740  0.0170       3.0
2                 3.0  0.71  0.63  0.6667   0.660  0.0330       3.0
3                 NaN  2.25  2.23  2.2433   2.250  0.0094       3.0
4                 4.0  0.49  0.47  0.4833   0.490  0.0094       3.0

         AnnounceDate          UpdatedDate  Actual
0 2014-05-01 11:45:00  2014-05-01T12:06:31    0.12
1 2014-07-31 11:45:00  2014-07-31T13:47:24    1.02
2 2014-11-04 12:45:00  2014-11-04T13:27:49    0.62
3 2015-02-27 12:45:00  2015-02-27T13:20:27    2.29
4 2015-02-27 12:45:00  2015-02-27T13:20:26    0.53
$ curl -X GET 'http://houston/fundamental/reuters/estimates.csv?codes=EPS&universes=us-banks&start_date=2014-01-01&end_date=2017-01-01' --output eps_estimates.csv
$ head eps_estimates.csv
ConId,Indicator,Unit,FiscalYear,FiscalPeriodEndDate,FiscalPeriodType,FiscalPeriodNumber,High,Low,Mean,Median,StdDev,NumOfEst,AnnounceDate,UpdatedDate,Actual
9029,EPS,U,2014,2014-03-31,Q,1,0.31,0.2,0.255,0.255,0.055,2,2014-05-01T11:45:00,2014-05-01T12:06:31,0.12
9029,EPS,U,2014,2014-06-30,Q,2,0.77,0.73,0.7467,0.74,0.017,3,2014-07-31T11:45:00,2014-07-31T13:47:24,1.02
9029,EPS,U,2014,2014-09-30,Q,3,0.71,0.63,0.6667,0.66,0.033,3,2014-11-04T12:45:00,2014-11-04T13:27:49,0.62
9029,EPS,U,2014,2014-12-31,A,,2.25,2.23,2.2433,2.25,0.0094,3,2015-02-27T12:45:00,2015-02-27T13:20:27,2.29
9029,EPS,U,2014,2014-12-31,Q,4,0.49,0.47,0.4833,0.49,0.0094,3,2015-02-27T12:45:00,2015-02-27T13:20:26,0.53

Reuters estimates aligned to prices

You can use a DataFrame of historical prices to get Reuters estimates and actuals data that is aligned to the price data. This makes it easy to perform matrix operations using fundamental data. First, isolate a particular field of your prices DataFrame; it doesn't matter which field you select, as only the date index and column names will be used to query the estimates. For daily data, use .loc:

>>> from quantrocket.history import get_historical_prices
>>> prices = get_historical_prices("japan-bank-eod", start_date="2017-01-01", fields=["Open","High","Low","Close", "Volume"])
>>> closes = prices.loc["Close"]

For intraday databases, use .loc and .xs to isolate a particular field and time, so that the DataFrame index consists only of dates. Again, the particular field and time don't matter, as only the columns and index will be used:

>>> from quantrocket.history import get_historical_prices
>>> prices = get_historical_prices("japan-bank-15min", start_date="2017-01-01", fields=["Close", "Volume"])
>>> closes = prices.loc["Close"].xs("15:30:00", level="Time")

Now use the DataFrame of prices to get a DataFrame of estimates and actuals.

>>> # Query earnings per share (EPS) and book value per share (BVPS)
>>> from quantrocket.fundamental import get_reuters_estimates_reindexed_like
>>> estimates = get_reuters_estimates_reindexed_like(
                     closes,
                     codes=["EPS", "BVPS"])

Similar to historical data, the resulting DataFrame can be thought of as several stacked DataFrames, with a MultiIndex consisting of the indicator code, the field (by default only Actual is returned), and the date. Note that get_reuters_estimates_reindexed_like shifts values forward by one day to avoid any lookahead bias.

>>> estimates.head()
ConId                       265598    3691937   15124833  208813719
Indicator Field  Date
BVPS      Actual 2016-01-04     21.39   26.5032    5.0875   336.454
                 2016-01-05     21.39   26.5032    5.0875   336.454
                 2016-01-06     21.39   26.5032    5.0875   336.454
                 2016-01-07     21.39   26.5032    5.0875   336.454
                 2016-01-08     21.39   26.5032    5.0875   336.454
...
EPS       Actual 2016-01-04      1.96      0.17      0.07      7.35
                 2016-01-05      1.96      0.17      0.07      7.35
                 2016-01-06      1.96      0.17      0.07      7.35
                 2016-01-07      1.96      0.17      0.07      7.35
                 2016-01-08      1.96      0.17      0.07      7.35

You can use .loc to isolate a particular indicator and field and perform matrix operations:

>>> book_values_per_share = estimates.loc["BVPS"].loc["Actual"]

Since the columns and date index match that of the historical data, you can perform matrix operations on prices and estimates/actuals together:

>>> # calculate price-to-book ratio
>>> pb_ratios = closes/book_values_per_share

Reuters financial statements

Collect Reuters financials

To use Reuters financial statements in QuantRocket, first collect the data from IB into your QuantRocket database. Then you can run queries against the database in your research and backtests.

To collect financial statements, specify one or more conids or universes to collect data for:

$ quantrocket fundamental collect-financials --universes 'japan-banks' 'singapore-banks'
status: the fundamental data will be collected asynchronously
>>> from quantrocket.fundamental import collect_reuters_financials
>>> collect_reuters_financials(universes=["japan-banks","singapore-banks"])
{'status': 'the fundamental data will be collected asynchronously'}
$ curl -X POST 'http://houston/fundamental/reuters/financials?universes=japan-banks&universes=singapore-banks'
{"status": "the fundamental data will be collected asynchronously"}

Multiple requests will be queued and processed sequentially. You can monitor flightlog via the command line or Papertrail to track progress:

$ quantrocket flightlog stream
2017-11-22 09:30:10 quantrocket.fundamental: INFO Collecting Reuters financials from IB for universes japan-banks, singapore-banks
2017-11-22 09:30:45 quantrocket.fundamental: INFO Expected remaining runtime to collect Reuters financials for universes japan-banks, singapore-banks: 0:00:33
2017-11-22 09:32:09 quantrocket.fundamental: INFO Saved 12979 total records for 100 total securities to quantrocket.fundamental.reuters.financials.sqlite for universes japan-banks, singapore-banks

Query Reuters financials

To query Reuters financials, first look up the code(s) for the metrics you care about, optionally limiting to a particular statement type:

$ quantrocket fundamental codes --report-types 'financials' --statement-types 'CAS'
financials:
  FCDP: Total Cash Dividends Paid
  FPRD: Issuance (Retirement) of Debt, Net
  FPSS: Issuance (Retirement) of Stock, Net
  FTLF: Cash from Financing Activities
  ITLI: Cash from Investing Activities
  OBDT: Deferred Taxes
  OCPD: Cash Payments
  OCRC: Cash Receipts
...
>>> from quantrocket.fundamental import list_reuters_codes
>>> list_reuters_codes(report_types=["financials"], statement_types=["CAS"])
{'financials': {'FCDP': 'Total Cash Dividends Paid',
  'FPRD': 'Issuance (Retirement) of Debt, Net',
  'FPSS': 'Issuance (Retirement) of Stock, Net',
  'FTLF': 'Cash from Financing Activities',
  'ITLI': 'Cash from Investing Activities',
  'OBDT': 'Deferred Taxes',
  'OCPD': 'Cash Payments',
  'OCRC': 'Cash Receipts',
...
}}
$ curl -X GET 'http://houston/fundamental/reuters/codes?report_types=financials&statement_types=CAS'
{"financials": {"FCDP": "Total Cash Dividends Paid", "FPRD": "Issuance (Retirement) of Debt, Net", "FPSS": "Issuance (Retirement) of Stock, Net", "FTLF": "Cash from Financing Activities", "ITLI": "Cash from Investing Activities", "OBDT": "Deferred Taxes", "OCPD": "Cash Payments", "OCRC": "Cash Receipts",...}}
QuantRocket reads the codes from the financial statements database; therefore, you must collect data into the database before you can list the available codes.

Let's query Net Income Before Taxes (code EIBT) for a universe of securities:

$ quantrocket fundamental financials 'EIBT' -u 'us-banks' -s '2014-01-01' -e '2017-01-01' -o financials.csv
$ csvlook -I --max-columns 6 --max-rows 5 financials.csv
| CoaCode | ConId | Amount | FiscalYear | FiscalPeriodEndDate | FiscalPeriodType | ... |
| ------- | ----- | ------ | ---------- | ------------------- | ---------------- | --- |
| EIBT    | 9029  | 13.53  | 2014       | 2014-12-31          | Annual           | ... |
| EIBT    | 9029  | 28.117 | 2015       | 2015-12-31          | Annual           | ... |
| EIBT    | 12190 | -7.307 | 2014       | 2014-05-31          | Annual           | ... |
| EIBT    | 12190 | -4.188 | 2015       | 2015-05-31          | Annual           | ... |
| EIBT    | 12190 | 1.873  | 2016       | 2016-05-31          | Annual           | ... |
| ...     | ...   | ...    | ...        | ...                 | ...              | ... |
>>> from quantrocket.fundamental import download_reuters_financials
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> download_reuters_financials(["EIBT"],f,universes=["us-banks"],
                            start_date="2014-01-01", end_date="2017-01-01")
>>> financials = pd.read_csv(f, parse_dates=["SourceDate", "FiscalPeriodEndDate"])
>>> financials.head()
CoaCode   ConId  Amount  FiscalYear FiscalPeriodEndDate FiscalPeriodType  \
0    EIBT    9029  13.530        2014          2014-12-31           Annual
1    EIBT    9029  28.117        2015          2015-12-31           Annual
2    EIBT   12190  -4.188        2015          2015-05-31           Annual
3    EIBT   12190   1.873        2016          2016-05-31           Annual
4    EIBT  270422  -3.770        2015          2015-09-30           Annual

 FiscalPeriodNumber StatementType  StatementPeriodLength  \
0                 NaN           INC                     12
1                 NaN           INC                     12
2                 NaN           INC                     12
3                 NaN           INC                     12
4                 NaN           INC                     12

StatementPeriodUnit UpdateTypeCode UpdateTypeDescription StatementDate  \
0                   M            UPD        Updated Normal    2014-12-31
1                   M            UPD        Updated Normal    2015-12-31
2                   M            UPD        Updated Normal    2015-05-31
3                   M            UPD        Updated Normal    2016-05-31
4                   M            UPD        Updated Normal    2015-09-30

AuditorNameCode        AuditorName AuditorOpinionCode AuditorOpinion Source  \
0              EY  Ernst & Young LLP                UNQ    Unqualified   10-K
1              EY  Ernst & Young LLP                UNQ    Unqualified   10-K
2            CROW  Crowe Horwath LLP                UNQ    Unqualified   10-K
3            CROW  Crowe Horwath LLP                UNQ    Unqualified   10-K
4            CROW  Crowe Horwath LLP                UNQ    Unqualified   10-K

SourceDate
0 2015-03-13
1 2016-02-29
2 2015-08-26
3 2016-08-05
4 2015-12-18
$ curl -X GET 'http://houston/fundamental/reuters/financials.csv?codes=EIBT&universes=us-banks&start_date=2014-01-01&end_date=2017-01-01' --output financials.csv
$ head financials.csv
CoaCode,ConId,Amount,FiscalYear,FiscalPeriodEndDate,FiscalPeriodType,FiscalPeriodNumber,StatementType,StatementPeriodLength,StatementPeriodUnit,UpdateTypeCode,UpdateTypeDescription,StatementDate,AuditorNameCode,AuditorName,AuditorOpinionCode,AuditorOpinion,Source,SourceDate
EIBT,9029,13.53,2014,2014-12-31,Annual,,INC,12,M,UPD,"Updated Normal",2014-12-31,EY,"Ernst & Young LLP",UNQ,Unqualified,10-K,2015-03-13
EIBT,9029,28.117,2015,2015-12-31,Annual,,INC,12,M,UPD,"Updated Normal",2015-12-31,EY,"Ernst & Young LLP",UNQ,Unqualified,10-K,2016-02-29
EIBT,12190,-4.188,2015,2015-05-31,Annual,,INC,12,M,UPD,"Updated Normal",2015-05-31,CROW,"Crowe Horwath LLP",UNQ,Unqualified,10-K,2015-08-26
EIBT,12190,1.873,2016,2016-05-31,Annual,,INC,12,M,UPD,"Updated Normal",2016-05-31,CROW,"Crowe Horwath LLP",UNQ,Unqualified,10-K,2016-08-05
EIBT,270422,-3.77,2015,2015-09-30,Annual,,INC,12,M,UPD,"Updated Normal",2015-09-30,CROW,"Crowe Horwath LLP",UNQ,Unqualified,10-K,2015-12-18
By default, annual rather than interim statements are returned, and restatements are included. If you prefer, you can choose interim instead of annual statements, and/or you can choose to exclude restatements:
$ quantrocket fundamental financials 'EIBT' -u 'us-banks' -s '2014-01-01' -e '2017-01-01' --interim --exclude-restatements -o interim_financials.csv
$ csvlook -I --max-columns 6 --max-rows 5 interim_financials.csv
| CoaCode | ConId  | Amount | FiscalYear | FiscalPeriodEndDate | FiscalPeriodType | ... |
| ------- | ------ | ------ | ---------- | ------------------- | ---------------- | --- |
| EIBT    | 9029   | 15.386 | 2016       | 2016-06-30          | Interim          | ... |
| EIBT    | 9029   | 8.359  | 2016       | 2016-09-30          | Interim          | ... |
| EIBT    | 12190  | 0.744  | 2017       | 2016-08-31          | Interim          | ... |
| EIBT    | 12190  | -0.595 | 2017       | 2016-11-30          | Interim          | ... |
| EIBT    | 270422 | 1.599  | 2016       | 2016-07-01          | Interim          | ... |
| ...     | ...    | ...    | ...        | ...                 | ...              | ... |
>>> from quantrocket.fundamental import download_reuters_financials
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> download_reuters_financials(["EIBT"],f,universes=["us-banks"],
                            interim=True,
                            exclude_restatements=True,
                            start_date="2014-01-01", end_date="2017-01-01")
>>> interim_financials = pd.read_csv(f, parse_dates=["SourceDate", "FiscalPeriodEndDate"])
>>> interim_financials.head()
CoaCode   ConId  Amount  FiscalYear FiscalPeriodEndDate FiscalPeriodType  \
0    EIBT    9029   8.359        2016          2016-09-30          Interim
1    EIBT    9029   3.459        2016          2016-12-31          Interim
2    EIBT   12190   0.744        2017          2016-08-31          Interim
3    EIBT   12190  -0.595        2017          2016-11-30          Interim
4    EIBT  270422   1.599        2016          2016-07-01          Interim

 FiscalPeriodNumber StatementType  StatementPeriodLength  \
0                   3           INC                      3
1                   4           INC                      3
2                   1           INC                      3
3                   2           INC                      3
4                   3           INC                      3

StatementPeriodUnit UpdateTypeCode UpdateTypeDescription StatementDate  \
0                   M            UPD        Updated Normal    2016-09-30
1                   M            UCA    Updated Calculated    2016-12-31
2                   M            UPD        Updated Normal    2016-08-31
3                   M            UPD        Updated Normal    2016-11-30
4                   M            UPD        Updated Normal    2016-07-01

AuditorNameCode            AuditorName AuditorOpinionCode AuditorOpinion  \
0             NaN                    NaN                NaN            NaN
1             DHS  Deloitte & Touche LLP                UNQ    Unqualified
2             NaN                    NaN                NaN            NaN
3             NaN                    NaN                NaN            NaN
4             NaN                    NaN                NaN            NaN

Source SourceDate
0   10-Q 2016-11-04
1   10-K 2017-03-03
2   10-Q 2016-10-13
3   10-Q 2017-01-12
4   10-Q 2016-08-10
$ curl -X GET 'http://houston/fundamental/reuters/financials.csv?codes=EIBT&universes=us-banks&interim=True&exclude_restatements=True&start_date=2014-01-01&end_date=2017-01-01' --output interim_financials.csv
$ head interim_financials.csv
CoaCode,ConId,Amount,FiscalYear,FiscalPeriodEndDate,FiscalPeriodType,FiscalPeriodNumber,StatementType,StatementPeriodLength,StatementPeriodUnit,UpdateTypeCode,UpdateTypeDescription,StatementDate,AuditorNameCode,AuditorName,AuditorOpinionCode,AuditorOpinion,Source,SourceDate
EIBT,9029,8.359,2016,2016-09-30,Interim,3,INC,3,M,UPD,"Updated Normal",2016-09-30,,,,,10-Q,2016-11-04
EIBT,9029,3.459,2016,2016-12-31,Interim,4,INC,3,M,UCA,"Updated Calculated",2016-12-31,DHS,"Deloitte & Touche LLP",UNQ,Unqualified,10-K,2017-03-03
EIBT,12190,0.744,2017,2016-08-31,Interim,1,INC,3,M,UPD,"Updated Normal",2016-08-31,,,,,10-Q,2016-10-13
EIBT,12190,-0.595,2017,2016-11-30,Interim,2,INC,3,M,UPD,"Updated Normal",2016-11-30,,,,,10-Q,2017-01-12
EIBT,270422,1.599,2016,2016-07-01,Interim,3,INC,3,M,UPD,"Updated Normal",2016-07-01,,,,,10-Q,2016-08-10

Reuters financials aligned to prices

As with Reuters estimates, you can use a DataFrame of historical prices to get Reuters fundamental data that is aligned to the price data. This makes it easy to perform matrix operations using fundamental data.

First, isolate a particular field of your prices DataFrame. It doesn't matter what field you select, as only the date index and the column names will be used to query the fundamentals. For daily data, use .loc:

>>> from quantrocket.history import get_historical_prices
>>> prices = get_historical_prices("japan-bank-eod", start_date="2017-01-01", fields=["Open","High","Low","Close", "Volume"])
>>> closes = prices.loc["Close"]

For intraday databases, use .loc and .xs to isolate a particular field and time, so that the DataFrame index consists only of dates. Again, the particular field and time don't matter, as only the columns and index will be used:

>>> from quantrocket.history import get_historical_prices
>>> prices = get_historical_prices("japan-bank-15min", start_date="2017-01-01", fields=["Close", "Volume"])
>>> closes = prices.loc["Close"].xs("15:30:00", level="Time")

Now use the DataFrame of prices to get a DataFrame of fundamentals.

>>> # Query total assets (ATOT), total liabilities (LTLL), and common shares
>>> # outstanding (QTCO)
>>> from quantrocket.fundamental import get_reuters_financials_reindexed_like
>>> financials = get_reuters_financials_reindexed_like(
                     closes,
                     coa_codes=["ATOT", "LTLL", "QTCO"])

Similar to historical data, the resulting DataFrame can be thought of as several stacked DataFrames, with a MultiIndex consisting of the COA (Chart of Account) code, the field (by default only Amount is returned), and the date. Note that get_reuters_financials_reindexed_like shifts fundamental values forward by one day to avoid any lookahead bias.

>>> financials.head()
ConId                           4157       4165        4187       4200
CoaCode Field  Date
ATOT    Amount 2018-01-02  21141.294    39769.0  1545.50394   425935.0
               2018-01-03  21141.294    39769.0  1545.50394   425935.0
               2018-01-04  21141.294    39769.0  1545.50394   425935.0
               2018-01-05  21141.294    39769.0  1545.50394   425935.0
               2018-01-08  21141.294    39769.0  1545.50394   425935.0
...
QTCO    Amount 2018-04-03  368.63579      557.0  101.73566  2061.06063
               2018-04-04  368.63579      557.0  101.73566  2061.06063
               2018-04-05  368.63579      557.0  101.73566  2061.06063
               2018-04-06  368.63579      557.0  101.73566  2061.06063
               2018-04-09  368.63579      557.0  101.73566  2061.06063

You can use .loc to isolate particular COA codes and fields and perform matrix operations:

>>> # calculate book value per share
>>> tot_assets = financials.loc["ATOT"].loc["Amount"]
>>> tot_liabilities = financials.loc["LTLL"].loc["Amount"]
>>> shares_out = financials.loc["QTCO"].loc["Amount"]
>>> book_values_per_share = (tot_assets - tot_liabilities)/shares_out

Since the columns and date index match that of the historical data, you can perform matrix operations on prices and fundamentals together:

>>> # calculate price-to-book ratio
>>> pb_ratios = closes/book_values_per_share

Reuters fundamental snippets

Enterprise Multiple (EV/EBITDA)

Enterprise multiple (enterprise value divided by EBITDA) is a popular valuation ratio that is not directly provided by the Reuters datasets. It can be calculated from metrics available in the Reuters financials dataset:

# Formulas:
#
# Enterprise Value:
#
#   EV = market value of common stock + market value of preferred equity + market value of debt + minority interest - cash and investments
#
#   Reuters codes:
#     QTCO: Total Common Shares Outstanding (multiply by price to get market value of common stock)
#     QTPO: Total Preferred Shares Outstanding (multiply by price to get market value of preferred stock)
#     STLD: Total Debt
#     LMIN: Minority Interest
#     ACAE: Cash & Equivalents
#
#   Reuters formula:
#     EV = (price X QTCO) + (price X QTPO) + STLD + LMIN - ACAE
#
# EBITDA
#
#   EBITDA = Operating Profit + Depreciation Expense + Amortization Expense
#
#   Reuters codes:
#     SOPI: Operating Income (= EBIT)
#     SDPR: Depreciation/Amortization
#
#   Reuters formula:
#     EBITDA = SOPI + SDPR

from quantrocket.history import get_historical_prices
from quantrocket.fundamental import get_reuters_financials_reindexed_like

prices = get_historical_prices("usa-stk-eod", fields=["Close"])
closes = prices.loc["Close"]

financials = get_reuters_financials_reindexed_like(
    closes,
    ["QTCO", "QTPO", "STLD", "LMIN", "ACAE", "SOPI", "SDPR"])

# EV
shares_out = financials.loc["QTCO"].loc["Amount"]
preferred_shares_out = financials.loc["QTPO"].loc["Amount"]
total_debts = financials.loc["STLD"].loc["Amount"]
minority_interests = financials.loc["LMIN"].loc["Amount"]
cash = financials.loc["ACAE"].loc["Amount"]

market_values_common = closes * shares_out
market_values_preferred = closes * preferred_shares_out.fillna(0)
evs = market_values_common + market_values_preferred + total_debts + minority_interests.fillna(0) - cash

# EBITDA
operating_profits = financials.loc["SOPI"].loc["Amount"]
depr_amorts = financials.loc["SDPR"].loc["Amount"]
ebitdas = operating_profits + depr_amorts.fillna(0)

enterprise_multiples = evs / ebitdas.where(ebitdas > 0)

Current vs prior fiscal period

Sometimes you may wish to calculate the change in a financial metric between the prior and current fiscal period. For example, suppose you wanted to calculate the change in the working capital ratio (defined as total assets / total liabilities). First, query the financial statements and calculate the current ratios:

from quantrocket.history import get_historical_prices
from quantrocket.fundamental import get_reuters_financials_reindexed_like

prices = get_historical_prices("usa-stk-eod", fields=["Close"])
closes = prices.loc["Close"]

# ATOT = total assets, LTLL = total liabilities
financials = get_reuters_financials_reindexed_like(
    closes,
    ["ATOT", "LTLL"],
    fields=["Amount", "FiscalPeriodEndDate"])

tot_assets = financials.loc["ATOT"].loc["Amount"]
tot_liabilities = financials.loc["LTLL"].loc["Amount"]
current_ratios = tot_assets / tot_liabilities.where(tot_liabilities != 0) # avoid division by zero

To get the prior year ratios, a simplistic method would be to shift the current ratios forward 1 year (current_ratios.shift(252)), but this would be suboptimal because company reporting dates may not be spaced exactly one year apart. A more reliable approach is shown below:

# get a boolean mask of the first day of each newly reported fiscal
# period
fiscal_periods = financials.loc["ATOT"].loc["FiscalPeriodEndDate"]
is_new_fiscal_periods = fiscal_periods != fiscal_periods.shift()

# shift the ratios forward one fiscal period by (1) shifting the ratios,
# (2) keeping only the ones that fall on the first day of the newly reported
# fiscal period, and (3) forward-filling
prior_ratios = current_ratios.shift().where(is_new_fiscal_periods).fillna(method="ffill")

# Now use the prior and current ratios however desired
ratio_increases = current_ratios > prior_ratios

Shortable shares and borrow fees

New in quantrocket/fundamental:1.3.0

QuantRocket provides current and historical short sale availability data from IB. The dataset includes the number of shortable shares available and the associated borrow fees. You can use this dataset to model the constraints and costs of short selling.

IB updates short sale availability data every 15 minutes. IB does not provide a historical archive of data but QuantRocket maintains a historical archive dating from April 16, 2018.

No IB market data subscriptions are required to access this dataset but you must have the appropriate exchange permissions in QuantRocket.

Collect short sale data

Shortable shares data and borrow fee data are stored separately but have similar APIs. Both datasets are organized by country. The available country names are:

   
australia, austria, belgium, british, canada, dutch, france, germany, hongkong, india, italy, japan, mexico, spain, swedish, swiss, usa

To use the data, first collect the desired dataset and countries from QuantRocket's archive into your local database. For shortable shares:

$ quantrocket fundamental collect-shortshares --countries 'japan' 'usa'
status: the shortable shares will be collected asynchronously
>>> from quantrocket.fundamental import collect_shortable_shares
>>> collect_shortable_shares(countries=["japan","usa"])
{'status': 'the shortable shares will be collected asynchronously'}
$ curl -X POST 'http://houston/fundamental/stockloan/shares?countries=japan&countries=usa'
{"status": "the shortable shares will be collected asynchronously"}
Similarly for borrow fees:
$ quantrocket fundamental collect-shortfees --countries 'japan' 'usa'
status: the borrow fees will be collected asynchronously
>>> from quantrocket.fundamental import collect_borrow_fees
>>> collect_borrow_fees(countries=["japan","usa"])
{'status': 'the borrow fees will be collected asynchronously'}
$ curl -X POST 'http://houston/fundamental/stockloan/fees?countries=japan&countries=usa'
{"status": "the borrow fees will be collected asynchronously"}
You can pass an invalid country such as "?" to either of the above endpoints to see the available country names.

QuantRocket will collect the data in 1-month batches and save it to your database. Monitor flightlog for progress:

2018-07-30 13:40:31 quantrocket.fundamental: INFO Collecting japan shortable shares from 2018-04-01 to present
2018-07-30 13:40:40 quantrocket.fundamental: INFO Collecting usa shortable shares from 2018-04-01 to present
2018-07-30 13:42:07 quantrocket.fundamental: INFO Saved 2993493 total shortable shares records to quantrocket.fundamental.stockloan.shares.sqlite

To update the data later, re-run the same command(s) you ran originally. QuantRocket will collect any new data since your last update and add it to your database.

Short sale data characteristics

Data storage

IB updates short sale availability data every 15 minutes, but the data for any given stock doesn't always change that frequently. To conserve disk space, QuantRocket stores the shortable shares and borrow fees data sparsely. That is, the data for any given security is stored only when the data changes. The following example illustrates:

| Timestamp (UTC)     | Shortable shares reported by IB for ABC stock | Stored in QuantRocket database |
| ------------------- | --------------------------------------------- | ------------------------------ |
| 2018-05-01T09:15:02 | 70,900                                         | yes                            |
| 2018-05-01T09:30:03 | 70,900                                         | -                              |
| 2018-05-01T09:45:02 | 70,900                                         | -                              |
| 2018-05-01T10:00:03 | 84,000                                         | yes                            |
| 2018-05-01T10:15:02 | 84,000                                         | -                              |

With this data storage design, the data is intended to be forward-filled after you query it. (The functions get_shortable_shares_reindexed_like and get_borrow_fees_reindexed_like do this for you.)

QuantRocket stores the first data point of each month for each stock regardless of whether it changed from the previous data point. This is to ensure that the data is not stored so sparsely that stocks are inadvertently omitted from date range queries. When querying and forward-filling the data you should request an initial 1-month buffer to ensure that infrequently-changing data is included in the query results. For example, if you want results back to June 17, 2018, you should query back to June 1, 2018 or earlier, as this ensures you will get the first-of-month data point for any infrequently changing securities. The functions get_shortable_shares_reindexed_like and get_borrow_fees_reindexed_like take care of this for you.
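
For illustration, below is a rough sketch of doing this forward-fill manually after exporting the data (the file name and dates are hypothetical); in practice, get_shortable_shares_reindexed_like and get_borrow_fees_reindexed_like handle the buffering and forward-filling for you:

import pandas as pd
from quantrocket.fundamental import download_shortable_shares

# export the sparsely-stored records to a CSV (file name is arbitrary)
download_shortable_shares("usa_shortable_shares.csv", universes=["usa-stk"])
shortable_shares = pd.read_csv("usa_shortable_shares.csv", parse_dates=["Date"])

# pivot to one column per security and forward-fill the sparse records;
# per the 1-month buffer rule, begin the fill at least one month before
# the first date you need (here 2018-06-01 for data back to 2018-06-17),
# then trim to the dates you actually care about
shortable_shares = shortable_shares.pivot(index="Date", columns="ConId", values="Quantity")
shortable_shares = shortable_shares.loc["2018-06-01":].ffill()
shortable_shares = shortable_shares.loc["2018-06-17":]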

Missing data

The shortable shares and borrow fees datasets represent IB's comprehensive list of shortable stocks. If stocks are missing from the data, that means they were never available to short. Stocks that were available to short and later became unavailable will be present in the data and will have values of 0 when they became unavailable (possibly followed by nonzero values if they later became available again).

Timestamps and latency

The data timestamps are in UTC and indicate the time at which IB made the data available. It takes approximately two minutes for the data to be processed and made available in QuantRocket's archive. Once available, the data will be added to your local database the next time you collect it.

Stocks with >10M shortable shares

In the shortable shares dataset, 10000000 (10 million) is the largest number reported and means "10 million or more."

Query short sale data

You can export the short sale data to CSV (or JSON), querying by universe or conid:

$ quantrocket fundamental shortshares -u 'usa-stk' -o usa_shortable_shares.csv
$ csvlook -I --max-rows 5 usa_shortable_shares.csv
| ConId | Date                | Quantity |
| ----- | ------------------- | -------- |
| 4027  | 2018-04-15T21:45:02 | 2200000  |
| 4027  | 2018-04-16T13:15:03 | 2300000  |
| 4027  | 2018-04-17T09:15:03 | 2100000  |
| 4027  | 2018-04-17T11:15:02 | 2000000  |
| 4027  | 2018-04-17T11:45:02 | 2100000  |
>>> from quantrocket.fundamental import download_shortable_shares
>>> import pandas as pd
>>> download_shortable_shares(
        "usa_shortable_shares.csv",
        universes=["usa-stk"])
>>> shortable_shares = pd.read_csv(
        "usa_shortable_shares.csv",
        parse_dates=["Date"])
>>> shortable_shares.head()
    ConId               Date  Quantity
0   4027 2018-04-15 21:45:02   2200000
1   4027 2018-04-16 13:15:03   2300000
2   4027 2018-04-17 09:15:03   2100000
3   4027 2018-04-17 11:15:02   2000000
4   4027 2018-04-17 11:45:02   2100000
$ curl -X GET 'http://houston/fundamental/stockloan/shares.csv?universes=usa-stk' --output usa_shortable_shares.csv
$ head usa_shortable_shares.csv
ConId,Date,Quantity
4027,2018-04-15T21:45:02,2200000
4027,2018-04-16T13:15:03,2300000
4027,2018-04-17T09:15:03,2100000
4027,2018-04-17T11:15:02,2000000
4027,2018-04-17T11:45:02,2100000
Similarly with borrow fees:
$ quantrocket fundamental shortfees -u 'usa-stk' -o usa_borrow_fees.csv
$ csvlook -I --max-rows 5 usa_borrow_fees.csv
| ConId | Date                | FeeRate |
| ----- | ------------------- | ------- |
| 4027  | 2018-04-15T21:45:02 | 0.25    |
| 4027  | 2018-04-24T14:15:02 | 0.262   |
| 4027  | 2018-04-25T14:15:03 | 0.2945  |
| 4027  | 2018-04-26T14:15:03 | 0.2642  |
| 4027  | 2018-04-27T14:15:02 | 0.2609  |
>>> from quantrocket.fundamental import download_borrow_fees
>>> import pandas as pd
>>> download_borrow_fees(
        "usa_borrow_fees.csv",
        universes=["usa-stk"])
>>> borrow_fees = pd.read_csv(
        "usa_borrow_fees.csv",
        parse_dates=["Date"])
>>> borrow_fees.head()
    ConId               Date  FeeRate
0   4027 2018-04-15 21:45:02   0.2500
1   4027 2018-04-24 14:15:02   0.2620
2   4027 2018-04-25 14:15:03   0.2945
3   4027 2018-04-26 14:15:03   0.2642
4   4027 2018-04-27 14:15:02   0.2609
$ curl -X GET 'http://houston/fundamental/stockloan/fees.csv?universes=usa-stk' --output usa_borrow_fees.csv
$ head usa_borrow_fees.csv
ConId,Date,FeeRate
4027,2018-04-15T21:45:02,0.25
4027,2018-04-24T14:15:02,0.262
4027,2018-04-25T14:15:03,0.2945
4027,2018-04-26T14:15:03,0.2642
4027,2018-04-27T14:15:02,0.2609

Short sale data aligned to prices

As with Reuters fundamentals, you can use a DataFrame of historical prices to get shortable shares or borrow fees data that is aligned to the price data.

First, isolate a particular field of your prices DataFrame. It doesn't matter what field you select, as only the date index and the column names will be used to query the short sale data. For daily data, use .loc:

>>> from quantrocket.history import get_historical_prices
>>> prices = get_historical_prices("usa-stk-1d", start_date="2018-04-16", fields=["Open","Close", "Volume"])
>>> closes = prices.loc["Close"]

For intraday databases, use .loc and .xs to isolate a particular field and time, so that the DataFrame index consists only of dates. Again, the particular field and time don't matter, as only the columns and index will be used:

>>> from quantrocket.history import get_historical_prices
>>> prices = get_historical_prices("usa-stk-15min", start_date="2018-04-16", fields=["Close", "Volume"], times="15:30:00")
>>> closes = prices.loc["Close"].xs("15:30:00", level="Time")

Now use the DataFrame of prices to get a DataFrame of shortable shares and/or borrow fees:

>>> from quantrocket.fundamental import get_shortable_shares_reindexed_like, get_borrow_fees_reindexed_like
>>> shortable_shares = get_shortable_shares_reindexed_like(closes)
>>> borrow_fees = get_borrow_fees_reindexed_like(closes)

The resulting DataFrame has a DatetimeIndex matching the input DataFrame:

>>> shortable_shares.head()
ConId       4027       4050       4065         4151       \
Date
2018-04-16  2200000.0  3000000.0  10000000.0   550000.0
2018-04-17  2300000.0  2900000.0  10000000.0   600000.0
2018-04-18  2100000.0  3000000.0  10000000.0   550000.0
2018-04-19  2100000.0  3000000.0  10000000.0   550000.0
2018-04-20  2100000.0  3000000.0  10000000.0   550000.0

The data for each date is as of midnight UTC. You can specify a different time and timezone using the time parameter:

>>> # request shortable shares as of the US market open
>>> shortable_shares = get_shortable_shares_reindexed_like(closes, time="09:30:00 America/New_York")
>>> # request borrow fees as of 5:00 PM New York time
>>> borrow_fees = get_borrow_fees_reindexed_like(closes, time="17:00:00 America/New_York")

Dates prior to April 16, 2018 (the start date of QuantRocket's historical archive) will have NaNs in the resulting DataFrame.

Borrow fees are stored as annualized interest rates. For example, 1.0198 indicates an annualized interest rate of 1.0198%:

>>> borrow_fees.head()
ConId          4187       4200       4205       4211  \
Date
2018-04-16     1.0198     0.7224     0.5954     0.2500
2018-04-17     0.5023     0.5912     0.5954     0.2500
2018-04-18     0.6257     0.5925     0.5943     0.8844
2018-04-19     0.9537     0.5946     0.6463     0.8844
2018-04-20     1.6422     0.5936     0.3096     0.6476

Below is an example of calculating borrow fees for a DataFrame of positions (adapted from Moonshot's BorrowFees slippage class):

borrow_fees = get_borrow_fees_reindexed_like(positions)
borrow_fees = borrow_fees / 100
# Fees are assessed daily but the DataFrame is expected to only
# include trading days, thus use 252 instead of 365. In reality the
# borrow fee is greater for weekend positions than weekday positions,
# but this implementation doesn't model that.
daily_borrow_fees = borrow_fees / 252
assessed_fees = positions.where(positions < 0, 0).abs() * daily_borrow_fees

Research

Once you've collected some historical data, you're ready to open a research notebook in JupyterLab and start analyzing your ideas. QuantRocket makes it easy to work with historical and fundamental data using Pandas. You can analyze your alpha factors using Alphalens. Once you're ready to backtest, your research code readily transfers to Moonshot since Moonshot is Pandas-based.

Why research notebooks?

The workflow of many quants includes a research stage prior to backtesting. The purpose of a separate research stage is to rapidly test ideas in a preliminary manner to see if they're worth the effort of a full-scale backtest. The research stage typically ignores transaction costs, liquidity constraints, and other real-world challenges that traders face and that backtests try to simulate. Thus, the research stage constitutes a "first cut": promising ideas advance to the more stringent simulations of backtesting, while unpromising ideas are discarded.

Jupyter notebooks provide Python quants with an excellent tool for ad-hoc research. Jupyter notebooks let you write code to crunch your data, run visualizations, and make sense of the results with narrative commentary.

Alphalens

Alphalens is an open source library created by Quantopian for analyzing alpha factors. You can use Alphalens early in your research process to determine if your ideas look promising.

For example, suppose you wanted to analyze the momentum factor, which says that recent winners tend to outperform recent losers. First, load your historical data and extract the closing prices:

>>> prices = get_historical_prices("demo-stocks-1d", start_date="2010-01-01", fields=["Close"])
>>> closes = prices.loc["Close"]

Next, calculate the 12-month returns, skipping the most recent month (as commonly prescribed in academic papers about the momentum factor):

>>> MOMENTUM_WINDOW = 252 # 12 months = 252 trading days
>>> RANKING_PERIOD_GAP = 22 # 1 month = 22 trading days
>>> earlier_closes = closes.shift(MOMENTUM_WINDOW)
>>> later_closes = closes.shift(RANKING_PERIOD_GAP)
>>> returns = (later_closes - earlier_closes) / earlier_closes

The 12-month returns are the predictive factor we will pass to Alphalens, along with pricing data so Alphalens can see whether the factor was in fact predictive. To avoid lookahead bias, in this example we should shift() our factor forward one period to align it with the subsequent prices, since the subsequent prices would represent our entry prices after calculating the factor. Alphalens expects the predictive factor to be stacked into a MultiIndex Series, while pricing data should be a DataFrame:

>>> import alphalens
>>> # shift factor to avoid lookahead bias
>>> returns = returns.shift()
>>> # stack as expected by Alphalens
>>> returns = returns.stack()
>>> factor_data = alphalens.utils.get_clean_factor_and_forward_returns(returns, closes)
>>> alphalens.tears.create_returns_tear_sheet(factor_data)

You'll see tabular statistics as well as graphs that look something like this:

Alphalens tearsheet

Code reuse in Jupyter

If you find yourself writing the same code again and again, you can factor it out into a .py file in Jupyter and import it into your notebooks and algo files. Any .py files in or under the /codeload directory inside Jupyter (that is, in or under the top-level directory visible in the Jupyter file browser) can be imported using standard Python import syntax. For example, suppose you've implemented a function in /codeload/research/utils.py called analyze_fundamentals. You can import and use the function in another file or notebook:

from codeload.research.utils import analyze_fundamentals

The .py files can live wherever you like in the directory tree; subdirectories can be reached using standard Python dot syntax.

To make your code importable as a standard Python package, the 'codeload' directory and each subdirectory must contain a __init__.py file. QuantRocket will create these files automatically if they don't exist.

QGrid

New in quantrocket/jupyter:1.3.0

QGrid is a Jupyter notebook extension created by Quantopian that provides Excel-like sorting and filtering of DataFrames in Jupyter notebooks. You can use it to explore a DataFrame interactively without writing code. A basic example is shown below:

from quantrocket.history import get_historical_prices
import qgrid

# Load prices (or any other DataFrame)
prices = get_historical_prices("usa-stk-1d")

# A wide DataFrame with columns for each security will be too wide for the screen
# so reshape to put the fields as columns instead
prices = prices.stack().unstack("Field")

# Construct and display the grid
widget = qgrid.show_grid(prices)
widget

You'll see a grid like this:

QGrid widget

After filtering the grid, you can get the edited DataFrame:

prices_edited = widget.get_changed_df()

Moonshot

Moonshot is a fast, vectorized Pandas-based backtester that supports daily or intraday data, multi-strategy backtests and parameter scans, and live trading. It is well-suited for running cross-sectional strategies or screens involving hundreds or even thousands of securities.

Check out the FAQ for an introductory overview of Moonshot.

What is Moonshot?

What is a vectorized backtester?

Moonshot is a vectorized backtester. What's the difference between event-driven backtesters like Zipline and vectorized backtesters like Moonshot? Event-driven backtests process one event at a time, where an event is usually one historical bar (or in the case of live trading, one real-time quote). Vectorized backtests process all events at once, by performing simultaneous calculations on an entire vector or matrix of data. (In pandas, a Series is a vector and a DataFrame is a matrix).

Imagine a simplistic strategy of buying a security whenever the price falls below $10 and selling whenever it rises above $10. We have a time series of prices and want to know which days to buy and which days to sell. In an event-driven backtester we loop through one date at a time and check the price at each iteration:

>>> data = {
>>>     "2017-02-01": 10.07,
>>>     "2017-02-02": 9.87,
>>>     "2017-02-03": 9.91,
>>>     "2017-02-04": 10.01
>>> }
>>> for date, price in data.items():
>>>     if price < 10:
>>>         buy_signal = True
>>>     else:
>>>         buy_signal = False
>>>     print(date, buy_signal)
2017-02-01 False
2017-02-02 True
2017-02-03 True
2017-02-04 False

In a vectorized backtest, we check all the prices at once to calculate our buy signals:

>>> import pandas as pd
>>> data = {
>>>     "2017-02-01": 10.07,
>>>     "2017-02-02": 9.87,
>>>     "2017-02-03": 9.91,
>>>     "2017-02-04": 10.01
>>> }
>>> prices = pd.Series(data)
>>> buy_signals = prices < 10
>>> buy_signals.head()
2017-02-01    False
2017-02-02     True
2017-02-03     True
2017-02-04    False
dtype: bool

Both backtests produce the same result but use a different approach.

Vectorized backtests are faster than event-driven backtests

Speed is one of the principal benefits of vectorized backtests, thanks to running calculations on an entire time series at once. Event-driven backtests can be prohibitively slow when working with large universes of securities and large amounts of data. Because of their speed, vectorized backtesters support rapid experimentation and testing of new ideas.

Watch out for look-ahead bias with vectorized backtesters

Look-ahead bias refers to making decisions in your backtest based on information that wouldn't have been available at the time of the trade. Because event-driven backtesters only give you one bar at a time, they generally protect you from look-ahead bias. Because a vectorized backtester gives you the entire time-series, it's easier to introduce look-ahead bias by mistake, for example generating signals based on today's close but then calculating the return from today's open instead of tomorrow's.
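
To make this concrete, here is a minimal sketch (reusing the toy prices from the earlier example and the same shift convention used in the Moonshot examples below) contrasting a biased and an unbiased return calculation:

import pandas as pd

prices = pd.Series({
    "2017-02-01": 10.07,
    "2017-02-02": 9.87,
    "2017-02-03": 9.91,
    "2017-02-04": 10.01})

signals = (prices < 10).astype(int)

# biased: credits us with the return of the same bar that generated the signal
biased_returns = prices.pct_change() * signals

# unbiased: enter the bar after the signal, and don't earn a return
# until the bar after the position is opened
positions = signals.shift()
unbiased_returns = prices.pct_change() * positions.shift()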

If you achieve a phenomenal backtest result on the first try with a vectorized backtester, check for look-ahead bias.

How does live trading work?

With event-driven backtesters, switching from backtesting to live trading typically involves changing out a historical data feed for a real-time market data feed, and replacing a simulated broker with a real broker connection.

With a vectorized backtester, live trading can be achieved by running an up-to-the-moment backtest and using the final row of signals (that is, today's signals) to generate orders.
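
As a rough conceptual sketch (the DataFrame below is hypothetical, not the output of any particular strategy), this amounts to taking the final row of the up-to-date signals and turning it into orders:

import pandas as pd

# hypothetical DataFrame of signals through today, one column per security
signals = pd.DataFrame(
    {12345: [0, 1, 1], 67890: [-1, -1, 0]},
    index=pd.to_datetime(["2017-09-19", "2017-09-20", "2017-09-21"]))

# only the final row (today's signals) is used to generate orders
todays_signals = signals.iloc[-1]
to_buy = todays_signals[todays_signals > 0].index.tolist()
to_sell = todays_signals[todays_signals < 0].index.tolist()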

Supported types of strategies

The vectorized design of Moonshot is well-suited for cross-sectional and factor-model strategies with regular rebalancing intervals, or for any strategy that "wakes up" at a particular time, checks current and historical market conditions, and makes trading decisions accordingly.

Examples of supported strategies:

  • End-of-day strategies
  • Intraday strategies that run at most once per day at a particular time of day
  • Cross-sectional and factor-model strategies
  • Market neutral strategies
  • Seasonal strategies (where "seasonal" might be time of year, day of month, day of week, or time of day)
  • Strategies that use fundamental data
  • Strategies that screen thousands of stocks using daily data
  • Strategies that screen thousands of stocks using 15- or 30-minute intraday data
  • Strategies that screen a few hundred stocks using 5-minute intraday data
  • Strategies that screen a few stocks using 1-minute intraday data

Examples of unsupported strategies:

  • Strategies that trade in and out of positions throughout the day (however, you can have multiple separate strategies, each trading at a particular time of day)
  • Scalping or market-making strategies
  • Strategies that continuously monitor a data feed rather than being scheduled to run at particular times
  • Low latency strategies
  • Path-dependent strategies that don't lend themselves to the vectorized design

Backtesting

Backtesting quickstart

Let's design a dual moving average strategy which buys tech stocks when their short moving average is above their long moving average. Assume we've already created a history database of daily bars for several tech stocks, like so:

$ # get the tech stock listings...
$ quantrocket master listings --exchange 'NASDAQ' --symbols 'GOOGL' 'NFLX' 'AAPL' 'AMZN'
status: the listing details will be collected asynchronously
$ # monitor flightlog for listing details to be collected, then make a universe:
$ quantrocket master get -e 'NASDAQ' -s 'GOOGL' 'NFLX' 'AAPL' 'AMZN' | quantrocket master universe 'tech-giants' -f -
code: tech-giants
inserted: 4
provided: 4
total_after_insert: 4
$ # get 1 day bars for the stocks
$ quantrocket history create-db 'tech-giants-1d' -u 'tech-giants' --bar-size '1 day'
status: successfully created quantrocket.history.tech-giants-1d.sqlite
$ quantrocket history collect 'tech-giants-1d'
status: the historical data will be collected asynchronously

Now let's write the minimal strategy code to run a backtest:

from moonshot import Moonshot

class DualMovingAverageStrategy(Moonshot):

    CODE = "dma-tech"
    DB = "tech-giants-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100

    def prices_to_signals(self, prices):
        closes = prices.loc["Close"]

        # Compute long and short moving averages
        lmavgs = closes.rolling(self.LMAVG_WINDOW).mean()
        smavgs = closes.rolling(self.SMAVG_WINDOW).mean()

        # Go long when short moving average is above long moving average
        signals = smavgs.shift() > lmavgs.shift()

        return signals.astype(int)

A strategy is a subclass of the Moonshot class. You implement your trading logic in the class methods and store your strategy parameters as class attributes. Class attributes include built-in Moonshot parameters which you can specify or override, as well as your own custom parameters. In the above example, CODE and DB are built-in parameters, while LMAVG_WINDOW and SMAVG_WINDOW are custom parameters that we've chosen to store as class attributes, allowing us to run parameter scans or create similar strategies with different parameters.

Place your code in a file inside the 'moonshot' directory in JupyterLab. QuantRocket recursively scans .py files in this directory and loads your strategies.

You can run backtests via the command line or inside a Jupyter notebook, and you can get back a CSV of backtest results or a tear sheet with performance plots.

$ quantrocket moonshot backtest 'dma-tech' -s '2005-01-01' -e '2017-01-01' --pdf -o dma_tech_tearsheet.pdf --details
>>> from quantrocket.moonshot import backtest
>>> import moonchart
>>> backtest("dma-tech", start_date="2005-01-01", end_date="2017-01-01",
             filepath_or_buffer="dma_tech.csv")
>>> t = moonchart.Tearsheet()
>>> t.from_moonshot_csv("dma_tech.csv")
$ curl -X POST 'http://houston/moonshot/backtests?strategies=dma-tech&start_date=2005-01-01&end_date=2017-01-01&pdf=true' > dma_tech_tearsheet.pdf

The performance plots will resemble the following:

moonshot tearsheet

Backtest visualization and analysis in Jupyter

In addition to running backtests from the CLI, you can run backtests from a Jupyter notebook and perform analysis and visualizations inside the notebook. First, run the backtest and save the results to a CSV:

>>> from quantrocket.moonshot import backtest
>>> backtest("dma-tech", start_date="2005-01-01", end_date="2017-01-01",
        filepath_or_buffer="dma_tech_results.csv")

You can do three main things with the CSV results:

  1. generate a performance tear sheet using Moonchart, an open source companion library to Moonshot;
  2. generate a performance tear sheet using pyfolio, an open source library created by Quantopian; and
  3. load the results into a Pandas DataFrame for further analysis.

To look at a Moonchart tear sheet:

>>> import moonchart
>>> t = moonchart.Tearsheet()
>>> t.from_moonshot_csv("dma_tech_results.csv")

To look at a pyfolio tear sheet:

>>> import pyfolio as pf
>>> pf.from_moonshot_csv("dma_tech_results.csv")

Moonchart and pyfolio offer different visualizations so it's nice to look at both.

You can also load the results into a DataFrame:

>>> import pandas as pd
>>> results = pd.read_csv("dma_tech_results.csv", parse_dates=["Date"], index_col=["Field","Date"])
>>> results.tail()
                   AAPL(265598)  AMZN(3691937)  NFLX(15124833)  GOOGL(208813719)
Field  Date
Weight 2016-12-23          0.25           0.25            0.25              0.25
       2016-12-27          0.25           0.25            0.25              0.25
       2016-12-28          0.25           0.25            0.25              0.25
       2016-12-29          0.25           0.25            0.25              0.25
       2016-12-30          0.25           0.25            0.25              0.25

The DataFrame consists of several stacked DataFrames, one DataFrame per field (see backtest field reference). Use .loc to isolate a particular field:

>>> returns = results.loc["Return"]
>>> returns.tail()
            AAPL(265598)  AMZN(3691937)  NFLX(15124833)  GOOGL(208813719)
Date
2016-12-23      0.000494      -0.001876        0.000020         -0.000580
2016-12-27      0.001588       0.003553        0.005494          0.000659
2016-12-28     -0.001066       0.000237       -0.004792         -0.001654
2016-12-29     -0.000064      -0.002260       -0.001112         -0.000525
2016-12-30     -0.001949      -0.004992       -0.003052         -0.003248

Since we specified details=True when running the backtest, there is a column per security. Had we omitted details=True, or if we were running a multi-strategy backtest, there would be a column per strategy.
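
For example, one rough way to view aggregate performance from a detailed backtest is to sum the per-security returns into a single strategy-level return series and compound it (an ad-hoc sketch; the tear sheets above provide richer analysis):

# sum the per-security returns into a strategy-level daily return series,
# then compound into a cumulative return curve
strategy_returns = returns.sum(axis=1)
cum_returns = (1 + strategy_returns).cumprod() - 1
cum_returns.tail()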

How a Moonshot backtest works

Moonshot is all about DataFrames. In a Moonshot backtest, we start with a DataFrame of historical prices and derive a variety of equivalently-indexed DataFrames, including DataFrames of signals, trade allocations, positions, and returns. These DataFrames consist of a time-series index (vertical axis) with one or more securities as columns (horizontal axis). A simple example of a DataFrame of signals is shown below for a strategy with a 2-security universe (securities are identified by conid):

ConId       12345  67890
Date
2017-09-19      0     -1
2017-09-20      1     -1
2017-09-21      1      0

A Moonshot strategy consists of strategy parameters (stored as class attributes) and strategy logic (implemented in class methods). The strategy logic required to run a backtest is spread across four main methods, mirroring the stages of a trade:

|                                             | method name                  | input/output |
| ------------------------------------------- | ---------------------------- | ------------ |
| what direction to trade?                    | prices_to_signals            | from a DataFrame of prices, return a DataFrame of integer signals, where 1=long, -1=short, and 0=cash |
| how much capital to allocate to the trades? | signals_to_target_weights    | from a DataFrame of integer signals (-1, 0, 1), return a DataFrame indicating how much capital to allocate to the signals, expressed as a percentage of the total capital allocated to the strategy (for example, -0.25, 0, 0.1 to indicate 25% short, cash, 10% long) |
| enter the positions when?                   | target_weights_to_positions  | from a DataFrame of target weights, return a DataFrame of positions (here we model the delay between when the signal occurs and when the position is entered, and possibly model non-fills) |
| what's our return?                          | positions_to_gross_returns   | from a DataFrame of positions and a DataFrame of prices, return a DataFrame of percentage returns before commissions and slippage (our return is the security's percent change over the period, multiplied by the size of the position) |

Since Moonshot is a vectorized backtester, each of these methods is called only once per backtest.

Our demo strategy above relies on the default implementations of several of these methods, but since it's better to be explicit than implicit, you should always implement these methods even if you copy the default behavior. Let's explicitly implement the default behavior in our demo strategy:

from moonshot import Moonshot

class DualMovingAverageStrategy(Moonshot):

    CODE = "dma-tech"
    DB = "tech-giants-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100

    def prices_to_signals(self, prices):
        closes = prices.loc["Close"]

        # Compute long and short moving averages
        lmavgs = closes.rolling(self.LMAVG_WINDOW).mean()
        smavgs = closes.rolling(self.SMAVG_WINDOW).mean()

        # Go long when short moving average is above long moving average
        signals = smavgs.shift() > lmavgs.shift()

        return signals.astype(int)

    def signals_to_target_weights(self, signals, prices):
        # spread our capital equally among our trades on any given day
        weights = self.allocate_equal_weights(signals) # provided by moonshot.mixins.WeightAllocationMixin
        return weights

    def target_weights_to_positions(self, weights, prices):
        # we'll enter in the period after the signal
        positions = weights.shift()
        return positions

    def positions_to_gross_returns(self, positions, prices):
        # Our return is the security's close-to-close return, multiplied by
        # the size of our position. We must shift the positions DataFrame because
        # we don't have a return until the period after we open the position
        closes = prices.loc["Close"]
        gross_returns = closes.pct_change() * positions.shift()
        return gross_returns

To summarize the above code, we generate signals based on moving average crossovers, we divide our capital equally among the securities with signals, we enter the positions the next day, and compute our (gross) returns using the securities' close-to-close returns.

Several weight allocation algorithms are provided out of the box via moonshot.mixins.WeightAllocationMixin.

Benchmarks

Optionally, we can identify a security within our strategy universe as a benchmark, and we'll get a chart of our strategy's performance against the benchmark. Our ETF strategy universe includes SPY, so let's make that our benchmark. First, lookup the conid (contract ID) if needed, since that's how we specify the benchmark:

$ quantrocket master get -e ARCA -s SPY -f ConId -p
ConId = 756733

Now set this conid as the benchmark:

class DualMovingAverageStrategyETF(DualMovingAverageStrategy):

    CODE = "dma-etf"
    DB = "etf-sampler-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100
    BENCHMARK = 756733 # Must exist within the strategy's DB

Run the backtest again, and we'll see an additional chart in our tear sheet:

moonshot tearsheet vs benchmark

Multi-strategy backtests

We can easily backtest multiple strategies at once to simulate running complex portfolios of strategies. Simply specify all of the strategies:

$ quantrocket moonshot backtest 'dma-tech' 'dma-etf' -s '2005-01-01' -e '2017-01-01' --pdf -o dma_multistrat.pdf
>>> from quantrocket.moonshot import backtest
>>> import moonchart
>>> backtest(["dma-tech", "dma-etf"], start_date="2005-01-01", end_date="2017-01-01",
             filepath_or_buffer="dma_multistrat.csv")
>>> t = moonchart.Tearsheet()
>>> t.from_moonshot_csv("dma_multistrat.csv")
$ curl -X POST 'http://houston/moonshot/backtests?strategies=dma-etf&strategies=dma-tech&start_date=2005-01-01&end_date=2017-01-01&pdf=true' > dma_multistrat.pdf

Our tear sheet will show the aggregate portfolio performance as well as the individual strategy performance:

moonshot multi-strategy tearsheet

By default, when backtesting multiple strategies, capital is divided equally among the strategies; that is, each strategy's allocation is 1.0 / number of strategies. If this isn't what you want, you can specify custom allocations for each strategy (which need not add up to 1):

$ # allocate 125% of capital to dma-tech and another 25% to dma-etf
$ quantrocket moonshot backtest 'dma-tech' 'dma-etf' --allocations 'dma-tech:1.25' 'dma-etf:0.25' -s '2005-01-01' -e '2017-01-01' --pdf -o dma_multistrat.pdf
>>> from quantrocket.moonshot import backtest
>>> # allocate 125% of capital to dma-tech and another 25% to dma-etf
>>> backtest(["dma-tech", "dma-etf"],
             allocations={"dma-tech": 1.25, "dma-etf": 0.25},
             start_date="2005-01-01", end_date="2017-01-01",
             filepath_or_buffer="dma_multistrat.csv")
$ # allocate 125% of capital to dma-tech and another 25% to dma-etf
$ curl -X POST 'http://houston/moonshot/backtests?strategies=dma-etf&strategies=dma-tech&start_date=2005-01-01&end_date=2017-01-01&allocations=dma-tech%3A1.25&allocations=dma-etf%3A0.25&pdf=true' > dma_multistrat.pdf

On-the-fly parameters

You can change Moonshot parameters on-the-fly from the Python client or CLI when running backtests, without having to edit your .py algo files. Pass parameters as KEY:VALUE pairs:

$ # disable commissions for this backtest
$ quantrocket moonshot backtest 'dma-tech' -o dma_tech_no_commissions.csv --params 'COMMISSION_CLASS:None'
>>> # disable commissions for this backtest
>>> backtest("dma-tech", filepath_or_buffer="dma_tech_no_commissions.csv",
             params={"COMMISSION_CLASS":None})
$ # disable commissions for this backtest
$ curl -X POST 'http://houston/moonshot/backtests?strategies=dma-tech&params=COMMISSION_CLASS%3ANone' > dma_tech_no_commissions.csv
This capability is provided as a convenience and also helps protect you from temporarily editing your algo file and forgetting to change it back. It is also available for parameter scans:
$ # add slippage for this parameter scan
$ quantrocket moonshot paramscan 'dma-tech' -p 'SMAVG_WINDOW' -v 5 20 100 --params 'SLIPPAGE_BPS:2' -o dma_tech_1d_with_slippage.csv
>>> # add slippage for this parameter scan
>>> from quantrocket.moonshot import scan_parameters
>>> scan_parameters("dma-tech",
                    param1="SMAVG_WINDOW", vals1=[5,20,100],
                    params={"SLIPPAGE_BPS":2},
                    filepath_or_buffer="dma_tech_1d_with_slippage.csv")
$ # add slippage for this parameter scan
$ curl -X POST 'http://houston/moonshot/paramscans?strategies=dma-tech&param1=SMAVG_WINDOW&vals1=5&vals1=20&vals1=100&params=SLIPPAGE_BPS%3A2' > dma_tech_1d_with_slippage.csv

Lookback windows

Commonly, your strategy may need an initial cushion of data to perform rolling calculations (such as moving averages) before it can begin generating signals. By default, Moonshot will infer the required cushion size by using the largest integer value of any strategy attribute whose name ends with _WINDOW. In the following example, the lookback window will be set to 200:

class DualMovingAverage(Moonshot):

    ...
    SMAVG_WINDOW = 50
    LMAVG_WINDOW = 200

Moonshot will load an additional 200 trading days of historical data (plus a small additional buffer) prior to your backtest start date so that your signals can actually begin on the start date. If there are no _WINDOW attributes, the cushion defaults to 252 (approx. 1 year). Or you can set it explicitly with the LOOKBACK_WINDOW attribute (set to 0 to disable):

class StrategyWithQuarterlyLookback(Moonshot):

    ...
    LOOKBACK_WINDOW = 63
If you make a habit of storing rolling window lengths as class attributes ending with _WINDOW, the lookback window will usually take care of itself and you shouldn't need to worry about it.
Adequate lookback windows are especially important for live trading. In case you don't name your rolling window attributes with _WINDOW, make sure to define a LOOKBACK_WINDOW that is adequate for your strategy's rolling calculations, as an inadequate lookback window will mean your strategy doesn't load enough data in live trading and therefore never generates any trades.

Segmented backtests

New in quantrocket/moonshot:1.2.2, quantrocket/jupyter:1.2.2

When running a backtest on a large universe and sizable date range, you might run out of memory. As explained in more detail in the Troubleshooting section, you'll see an error like this:

$ quantrocket moonshot backtest 'big-boy' --start-date '2000-01-01'
msg: 'HTTPError(''502 Server Error: Bad Gateway for url: http://houston/moonshot/backtests?strategies=big-boy&start_date=2000-01-01'',
  ''please check the logs for more details'')'
status: error

And in the logs you'll find this:

$ quantrocket flightlog stream --hist 1
2017-10-02 19:29:32 quantrocket.moonshot: ERROR the system killed the worker handling the request, likely an Out Of Memory error; if you were backtesting, try a segmented backtest to reduce memory usage (for example `segment="A"`), or add more memory

When this happens, you can try a segmented backtest. In a segmented backtest, QuantRocket breaks the backtest date range into smaller segments (for example, 1-year segments), runs each segment of the backtest in succession, and concatenates the partial results into a single backtest result. The output is identical to a non-segmented backtest, but the memory footprint is smaller. The segment option takes a Pandas frequency string specifying the desired size of the segments, for example "A" for annual segments, "Q" for quarterly segments, or "2A" for 2-year segments:

$ quantrocket moonshot backtest 'big-boy' -s '2000-01-01' -e '2018-01-01' --segment 'A' -o backtest_result.csv
>>> from quantrocket.moonshot import backtest
>>> backtest("big-boy", start_date="2001-01-01", end_date="2018-01-01", segment="A", filepath_or_buffer="backtest_result.csv")
$ curl -X POST 'http://houston/moonshot/backtests.csv?strategies=big-boy&start_date=2001-01-01&end_date=2018-01-01&segment=A'
Providing a start and end date is optional for a non-segmented backtest but required for a segmented backtest.

In the detailed logs, you'll see Moonshot running through each backtest segment:

$ quantrocket flightlog stream -d
quantrocket_moonshot_1|[big-boy] Backtesting strategy from 2001-01-01 to 2001-12-30
quantrocket_moonshot_1|[big-boy] Backtesting strategy from 2001-12-31 to 2002-12-30
quantrocket_moonshot_1|[big-boy] Backtesting strategy from 2002-12-31 to 2003-12-30
quantrocket_moonshot_1|[big-boy] Backtesting strategy from 2003-12-31 to 2004-12-30
quantrocket_moonshot_1|[big-boy] Backtesting strategy from 2004-12-31 to 2005-12-30
...

Backtest field reference

Backtest result CSVs contain the following fields in a stacked format. Each field is a DataFrame from the backtest. For detailed backtests, there is a column per security. For non-detailed or multi-strategy backtests, there is a column per strategy, with each column containing the aggregated (summed) results of all securities in the strategy.

  • Signal: the signals returned by prices_to_signals.
  • NetExposure: the net long or short positions returned by target_weights_to_positions. Expressed as a decimal percentage of capital base.
  • AbsExposure: the absolute value of positions, irrespective of their side (long or short). Expressed as a decimal percentage of capital base. This represents the total market exposure of the strategy.
  • Weight: the target weights allocated to the strategy, after multiplying by strategy allocation and applying any weight constraints. Expressed as a decimal percentage of capital base.
  • AbsWeight: the absolute value of the target weights.
  • Trade: the strategy's day-to-day turnover. Expressed as a decimal percentage of capital base.
  • Return: the returns, after commissions and slippage. Expressed as a decimal percentage of capital base.
  • Commission: the commissions deducted from gross returns. Expressed as a decimal percentage of capital base.
  • Slippage: the slippage deducted from gross returns. Expressed as a decimal percentage of capital base.
  • Benchmark: the prices of the benchmark security, if any.

Parameter scans

You can run 1-dimensional or 2-dimensional parameter scans to see how your strategy performs for a variety of parameter values. You can run parameter scans against any parameter which is stored as a class attribute on your strategy (or as a class attribute on a parent class of your strategy).

For example, returning to the moving average crossover example, recall that the long and short moving average windows are stored as class attributes:

class DualMovingAverageStrategy(Moonshot):

    CODE = "dma-tech"
    DB = "tech-giants-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100

Let's try varying the short moving average window on our dual moving average strategy:

$ quantrocket moonshot paramscan 'dma-tech' -p 'SMAVG_WINDOW' -v 5 20 100 -s '2005-01-01' -e '2017-01-01' --pdf -o dma_1d.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> import moonchart
>>> scan_parameters("dma-tech", start_date="2005-01-01", end_date="2017-01-01",
                    param1="SMAVG_WINDOW", vals1=[5,20,100],
                    filepath_or_buffer="dma_tech_1d.csv")
>>> # Note the use of ParamscanTearsheet rather than Tearsheet
>>> t = moonchart.ParamscanTearsheet()
>>> t.from_moonshot_csv("dma_tech_1d.csv")
$ curl -X POST 'http://houston/moonshot/paramscans?strategies=dma-tech&start_date=2005-01-01&end_date=2017-01-01&param1=SMAVG_WINDOW&vals1=5&vals1=20&vals1=100&pdf=true' > dma_tech_1d.pdf

The resulting tear sheet will show how the strategy performs for each parameter value:

moonshot paramscan 1-D tearsheet

Let's try a 2-dimensional parameter scan, varying both our short and long moving averages:

$ quantrocket moonshot paramscan 'dma-tech' --param1 'SMAVG_WINDOW' --vals1 5 20 100 --param2 'LMAVG_WINDOW' --vals2 150 200 300 -s '2005-01-01' -e '2017-01-01' --pdf -o dma_2d.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> import moonchart
>>> scan_parameters("dma-tech", start_date="2005-01-01", end_date="2017-01-01",
                    param1="SMAVG_WINDOW", vals1=[5,20,100],
                    param2="LMAVG_WINDOW", vals2=[150,200,300],
                    filepath_or_buffer="dma_tech_2d.csv")
>>> t = moonchart.ParamscanTearsheet()
>>> t.from_moonshot_csv("dma_tech_2d.csv")
$ curl -X POST 'http://houston/moonshot/paramscans?strategies=dma-tech&start_date=2005-01-01&end_date=2017-01-01&param1=SMAVG_WINDOW&vals1=5&vals1=20&vals1=100&param2=LMAVG_WINDOW&vals2=150&vals2=200&vals2=300&pdf=true' > dma_tech_2d.pdf

This time our tear sheet uses a heat map to visualize the 2-D results:

moonshot paramscan 2-D tearsheet

We can even run a 1-D or 2-D parameter scan on multiple strategies at once:

$ quantrocket moonshot paramscan 'dma-tech' 'dma-etf' -p 'SMAVG_WINDOW' -v 5 20 100 -s '2005-01-01' -e '2017-01-01' --pdf -o dma_multistrat_1d.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> import moonchart
>>> scan_parameters(["dma-tech","dma-etf"], start_date="2005-01-01", end_date="2017-01-01",
                    param1="SMAVG_WINDOW", vals1=[5,20,100],
                    filepath_or_buffer="dma_multistrat_1d.csv")
>>> t = moonchart.ParamscanTearsheet()
>>> t.from_moonshot_csv("dma_multistrat_1d.csv")
$ curl -X POST 'http://houston/moonshot/paramscans?strategies=dma-tech&strategies=dma-etf&start_date=2005-01-01&end_date=2017-01-01&param1=SMAVG_WINDOW&vals1=5&vals1=20&vals1=100&pdf=true' > dma_multistrat_1d.pdf

The tear sheet shows the scan results for the individual strategies and the aggregate portfolio:

moonshot paramscan multi-strategy 1-D tearsheet

Often when first coding a strategy your parameter values will be hardcoded in the body of your methods:

class TrendDay(Moonshot):
    ...
    def prices_to_signals(self, prices):
        ...
        afternoon_prices = closes.xs("14:00:00", level="Time")
        ...

When you're ready to run parameter scans, simply factor out the hardcoded values into class attributes, naming the attribute whatever you like:

class TrendDay(Moonshot):
    ...
    DECISION_TIME = "14:00:00"

    def prices_to_signals(self, prices):
        ...
        afternoon_prices = closes.xs(self.DECISION_TIME, level="Time")
        ...

Now run your parameter scan:

$ quantrocket moonshot paramscan 'trend-day' -p 'DECISION_TIME' -v '14:00:00' '14:15:00' '14:30:00' --pdf -o trend_day_afternoon_time_scan.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> import moonchart
>>> scan_parameters("trend-day",
                    param1="DECISION_TIME", vals1=["14:00:00", "14:15:00", "14:30:00"],
                    filepath_or_buffer="trend_day_afternoon_time_scan.csv")
>>> t = moonchart.ParamscanTearsheet()
>>> t.from_moonshot_csv("trend_day_afternoon_time_scan.csv")
$ curl -X POST 'http://houston/moonshot/paramscans?strategies=trend-day&param1=DECISION_TIME&vals1=14%3A00%3A00&vals1=14%3A15%3A00&vals1=14%3A30%3A00&pdf=true' > trend_day_afternoon_time_scan.pdf

You can scan parameter values other than just strings or numbers, including True, False, None, and lists of values. You can pass the special value "default" to run an iteration that preserves the parameter value already defined on your strategy.

$ quantrocket moonshot paramscan 'dma-tech' --param1 'SLIPPAGE_BPS' --vals1 'default' 'None' '2' '100' --param2 'EXCLUDE_CONIDS' --vals2 '756733' '6604766' '756733,6604766' --pdf -o paramscan_results.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> import moonchart
>>> scan_parameters("dma-tech",
                    param1="SLIPPAGE_BPS", vals1=["default",None,2,100],
                    param2="EXCLUDE_CONIDS", vals2=[756733,6604766,[756733,6604766]],
                    filepath_or_buffer="paramscan_results.csv")
>>> t = moonchart.ParamscanTearsheet()
>>> t.from_moonshot_csv("paramscan_results.csv")
$ curl -X POST 'http://houston/moonshot/paramscans.csv?strategies=dma-tech&param1=SLIPPAGE_BPS&vals1=default&vals1=None&vals1=2&vals1=100&param2=EXCLUDE_CONIDS&vals2=756733&vals2=6604766&vals2=%5B756733%2C+6604766%5D' > paramscan_results.csv
Parameter values are converted to strings, sent over HTTP to the moonshot service, then converted back to the appropriate types by the moonshot service using Python's built-in eval() function.
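
As a rough illustration of this conversion (this is not the actual moonshot service code), a string value can be eval'd back into a Python type, with plain strings such as "default" left as-is:

# rough illustration only; not the actual moonshot implementation
def parse_param(raw):
    try:
        return eval(raw)
    except (NameError, SyntaxError):
        return raw  # leave non-Python strings like "default" untouched

for raw in ["None", "2", "default", "[756733, 6604766]"]:
    print(raw, "->", repr(parse_param(raw)))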

Segmented parameter scans

As with backtests, you can run segmented parameter scans to reduce memory usage:

$ quantrocket moonshot paramscan 'big-boy' -s '2000-01-01' -e '2018-01-01' --segment 'A' -p 'MAVG_WINDOW' -v 20 40 60 -o paramscan_result.csv
>>> from quantrocket.moonshot import scan_parameters
>>> scan_parameters("big-boy", start_date="2001-01-01", end_date="2018-01-01", segment="A", param1="MAVG_WINDOW", vals1=[20,40,60], filepath_or_buffer="paramscan_result.csv")
$ curl -X POST 'http://houston/moonshot/paramscans.csv?strategies=big-boy&start_date=2000-01-01&end_date=2018-01-01&segment=A&param1=MAVG_WINDOW&vals1=20&vals1=40&vals1=60' > paramscan_result.csv

Learn more about segmented backtests in the section on backtesting.

Moonshot development workflow

Interactive strategy development in Jupyter

Working with DataFrames is much easier when done interactively. You can follow and validate the transformations at each step, rather than having to write lots of code and run a complete backtest only to wonder why the results don't match what you expected.

Luckily, Moonshot is a simple, fairly "raw" framework that doesn't perform lots of invisible, black-box magic, making it straightforward to step through your DataFrame transformations in a notebook and later transfer your working code to a .py file.

To interactively develop our moving average crossover strategy, define a simple Moonshot class that points to your history database:

from moonshot import Moonshot
class DualMovingAverageStrategy(Moonshot):
    DB = "tech-giants-1d"
To see other built-in parameters you might define besides DB, check the Moonshot docstring by typing: Moonshot?

Instantiate the strategy and get a DataFrame of prices:

self = DualMovingAverageStrategy()
prices = self.get_historical_prices(start_date="2016-01-01")

This is the same prices DataFrame that will be passed to your prices_to_signals method in a backtest, so you can now interactively implement your logic to produce a DataFrame of signals from the DataFrame of prices (peeking at the intermediate DataFrames as you go):

closes = prices.loc["Close"]

# Compute long and short moving averages
# (later we should move the window lengths to class attributes
# so we can edit them more easily and run parameter scans)
lmavgs = closes.rolling(300).mean()
smavgs = closes.rolling(100).mean()

# Go long when short moving average is above long moving average
signals = smavgs.shift() > lmavgs.shift()

# Turn signals from booleans into ints
signals = signals.astype(int)
Attaching a code console to a notebook in JupyterLab provides a convenient "scratch pad" where you can peek at DataFrames or run one-off commands without cluttering your notebook.

In a backtest your signals DataFrame will be passed to your signals_to_target_weights method, so now work on the logic for that method. In this case it's easy:

# spread our capital equally among our trades on any given day
weights = self.allocate_equal_weights(signals)

Next, transform the target weights into a positions DataFrame; this will become the logic of your strategy's target_weights_to_positions method:

# we'll enter in the period after the signal
positions = weights.shift()

Finally, compute gross returns from your positions; this will become positions_to_gross_returns:

# Our return is the security's close-to-close return, multiplied by
# the size of our position. We must shift the positions DataFrame because
# we don't have a return until the period after we open the position
closes = prices.loc["Close"]
gross_returns = closes.pct_change() * positions.shift()

Once you've stepped through this process and your code appears to be doing what you expect, you can create a .py file for your strategy and copy your code into it, then run a full backtest.

Don't forget to add a CODE attribute to your strategy class at this point to identify it (e.g. "dma-tech"). The class name of your strategy and the name of the file in which you store it don't matter; only the CODE is used to identify the strategy throughout QuantRocket.
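
Assembled into a .py file, the code developed above might look something like the following sketch (using the same class attributes and method bodies as the interactive session; adjust the names and values to suit your own strategy):

from moonshot import Moonshot

class DualMovingAverageStrategy(Moonshot):

    CODE = "dma-tech"
    DB = "tech-giants-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100

    def prices_to_signals(self, prices):
        closes = prices.loc["Close"]
        # compute long and short moving averages
        lmavgs = closes.rolling(self.LMAVG_WINDOW).mean()
        smavgs = closes.rolling(self.SMAVG_WINDOW).mean()
        # go long when short moving average is above long moving average
        signals = smavgs.shift() > lmavgs.shift()
        return signals.astype(int)

    def signals_to_target_weights(self, signals, prices):
        # spread our capital equally among our trades on any given day
        weights = self.allocate_equal_weights(signals)
        return weights

    def target_weights_to_positions(self, target_weights, prices):
        # we'll enter in the period after the signal
        positions = target_weights.shift()
        return positions

    def positions_to_gross_returns(self, positions, prices):
        # close-to-close return multiplied by position size, shifted because
        # we don't have a return until the period after we open the position
        closes = prices.loc["Close"]
        gross_returns = closes.pct_change() * positions.shift()
        return gross_returns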

Save custom DataFrames to backtest results

You can add custom DataFrames to your backtest results, in addition to the DataFrames that are included by default. For example, you might save the computed moving averages:

def prices_to_signals(self, prices):
    closes = prices.loc["Close"]
    mavgs = closes.rolling(50).mean()
    self.save_to_results("MAvg", mavgs)
    ...

After running a backtest with details=True, the resulting CSV will contain the custom DataFrame:

>>> import pandas as pd
>>> results = pd.read_csv("dma_tech_results.csv", parse_dates=["Date"], index_col=["Field","Date"])
>>> mavgs = results.loc["MAvg"]
>>> mavgs.head()
            AAPL(265598)  AMZN(3691937)  NFLX(15124833)  GOOGL(208813719)
Date
2008-12-22      17.31265        62.4673         3.80260         190.40620
2008-12-23      17.21225        62.2206         3.79965         189.55615
2008-12-24      17.11485        61.9779         3.79620         188.75510
2008-12-26      17.00795        61.7046         3.79415         187.85675
2008-12-29      16.89715        61.4177         3.79120         186.91120
Custom DataFrames are only returned when running single-strategy backtests using the --details/details=True option.

Debugging Moonshot strategies

There are several options for debugging your strategies.

First, you can interactively develop the strategy in a notebook. This is particularly helpful in the early stages of development.

Second, if your strategy is already in a .py file, you can save custom DataFrames to your backtest output and try to see what's going on.

Third, you can add print statements to your .py file, which will show up in flightlog's detailed logs. Open a terminal and start streaming the logs:

$ quantrocket flightlog stream -d

Then run your backtest from a notebook or another terminal.
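
For example, you might temporarily add print statements inside prices_to_signals to inspect an intermediate DataFrame (a simple sketch; print whatever fields are useful to you):

def prices_to_signals(self, prices):
    closes = prices.loc["Close"]
    mavgs = closes.rolling(50).mean()
    # temporary debugging output; this will appear in flightlog's detailed logs
    print("closes shape:", closes.shape)
    print(mavgs.tail())
    ...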

If you want to inspect or debug the Moonshot library itself (we hope it's so solid you never need to!), a good tactic is to find the relevant method from the base Moonshot class and copy and paste it into your own strategy:

class MyStrategy(Moonshot):

    ...
    # copied from GitHub
    def backtest(self, start_date=None, end_date=None):
        self.is_backtest = True
        ...

This will override the corresponding method on the base Moonshot class, so you can now add print statements to your copy of the method and they'll show up in flightlog.

Strategy inheritance

Often, you may want to re-use a strategy's logic while changing some of the parameters. For example, perhaps you'd like to run an existing strategy on a different market. To do so, simply subclass your existing strategy and modify the parameters as needed. Let's try our dual moving average strategy on a group of ETFs. First, get the historical data for the ETFs:

$ # get a handful of ETF listings...
$ quantrocket master listings --exchange 'ARCA' --symbols 'SPY' 'XLF' 'EEM' 'VNQ' 'XOP' 'GDX'
status: the listing details will be collected asynchronously
$ # monitor flightlog for listing details to be collected, then make a universe:
$ quantrocket master get -e 'ARCA' -s 'SPY' 'XLF' 'EEM' 'VNQ' 'XOP' 'GDX' | quantrocket master universe 'etf-sampler' -f -
code: etf-sampler
inserted: 6
provided: 6
total_after_insert: 6
$ # get 1 day bars for the ETFs
$ quantrocket history create-db 'etf-sampler-1d' -u 'etf-sampler' --bar-size '1 day'
status: successfully created quantrocket.history.etf-sampler-1d.sqlite
$ quantrocket history collect 'etf-sampler-1d'
status: the historical data will be collected asynchronously

Since we're inheriting from an existing strategy, implementing our strategy is easy:

# derive a strategy from DualMovingAverageStrategy (defined earlier in the file)
class DualMovingAverageStrategyETF(DualMovingAverageStrategy):

    CODE = "dma-etf"
    DB = "etf-sampler-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100

Now we can run our backtest:

$ quantrocket moonshot backtest 'dma-etf' -s '2005-01-01' -e '2017-01-01' --pdf -o dma_etf_tearsheet.pdf --details
>>> from quantrocket.moonshot import backtest
>>> backtest("dma-etf", start_date="2005-01-01", end_date="2017-01-01",
             filepath_or_buffer="dma_etf.csv", details=True)
$ curl -X POST 'http://houston/moonshot/backtests?strategies=dma-etf&start_date=2005-01-01&end_date=2017-01-01&pdf=true' > dma_etf_tearsheet.pdf

Code organization

Your Moonshot code should be placed in the /codeload/moonshot subdirectory inside JupyterLab. QuantRocket recursively scans .py files in this directory and loads your strategies (a strategy is defined as a subclass of moonshot.Moonshot). You can place as many strategies as you like within a single .py file, or you can place them in separate files. If you like, you can organize your .py files into subdirectories as you see fit.

If you want to re-use code across multiple files, you can do so using standard Python import syntax. Any .py files in or under the /codeload directory inside Jupyter (that is, any .py files you can see in the Jupyter file browser) can be imported from codeload. For example, consider a simple directory structure containing two files for your strategies and one file with helper functions used by multiple strategies:

/codeload/moonshot/helpers.py
/codeload/moonshot/meanreversion_strategies.py
/codeload/moonshot/momentum_strategies.py

Suppose you've implemented a function in helpers.py called rebalance_positions. You can import and use the function in another file like so:

from codeload.moonshot.helpers import rebalance_positions

Importing also works if you're using subdirectories:

/codeload/moonshot/helpers/rebalance.py
/codeload/moonshot/meanreversion/buythedip.py
/codeload/moonshot/momentum/hml.py

Just use standard Python dot syntax to reach your modules wherever they are in the directory tree:

from codeload.moonshot.helpers.rebalance import rebalance_positions
To make your code importable as a standard Python package, the 'codeload' directory and each subdirectory must contain an __init__.py file. QuantRocket will create these files automatically if they don't exist.

Interactive order creation in Jupyter

This section might make more sense after reading about live trading.

Just as you can interactively develop your Moonshot backtest code in Jupyter, you can use a similar approach to develop your order_stubs_to_orders method.

First, import and instantiate your strategy:

from codeload.moonshot.dual_moving_average import DualMovingAverageTechGiantsStrategy
self = DualMovingAverageTechGiantsStrategy()

Next, run the trade method, which returns a DataFrame of orders. You'll need to pass at least one account allocation (normally this would be pulled from quantrocket.moonshot.allocations.yml).

allocations = {"DU12345": 1.0}
orders = self.trade(allocations)
The account must be a valid account as Moonshot will try to pull the account balance from the account service. You can run quantrocket account balance --latest to make sure account history is available for the account.
If self.trade() returns no orders, you can pass a review_date to generate orders for an earlier date, and/or modify prices_to_signals to create some trades for the purpose of testing.

If your strategy hasn't overridden order_stubs_to_orders, you'll receive the orders DataFrame as processed by the default implementation of order_stubs_to_orders on the Moonshot base class. You can return the orders to the state in which they were passed to order_stubs_to_orders by dropping a few columns:

# revert to minimal order stubs
orders = orders.drop(["OrderType", "Tif", "Exchange"], axis=1)

You can now experiment with modifying your orders DataFrame. For example, re-add the required fields:

orders["Exchange"] = "SMART"
orders["OrderType"] = "MKT"
orders["Tif"] = "DAY"

Or attach exit orders:

child_orders = self.orders_to_child_orders(orders)
child_orders.loc[:, "OrderType"] = "MOC"
orders = pd.concat([orders, child_orders])

To use the prices DataFrame for order creation (for example, to set limit prices), query recent historical prices. (To learn more about the historical data start date used in live trading, see the section on lookback windows.)

prices = self.get_historical_prices("2018-04-01")

Now create limit prices set to the prior close:

closes = prices.loc["Close"]
prior_closes = closes.shift()
prior_closes = self.reindex_like_orders(prior_closes, orders)
orders["OrderType"] = "LMT"
orders["LmtPrice"] = prior_closes

Intraday strategies

Moonshot supports intraday strategies that make trading decisions at most once per day.

When your strategy points to an intraday history database, the strategy receives a DataFrame of intraday prices and must "reduce" the intraday prices to a DataFrame of daily signals. Let's look at an example.

Consider a simple "trend day" strategy using several ETFs: if the ETF is up (down) more than 2% from yesterday's close as of 2:00 PM, buy (sell) the ETF at 2:15 PM and exit the position at the market close.

First, get the historical data for the ETFs:

$ # get a handful of ETF listings...
$ quantrocket master listings --exchange 'ARCA' --symbols 'SPY' 'XLF' 'EEM' 'VNQ' 'XOP' 'GDX'
status: the listing details will be collected asynchronously
$ # monitor flightlog for listing details to be collected, then make a universe:
$ quantrocket master get -e 'ARCA' -s 'SPY' 'XLF' 'EEM' 'VNQ' 'XOP' 'GDX' | quantrocket master universe 'etf-sampler' -f -
code: etf-sampler
inserted: 6
provided: 6
total_after_insert: 6
$ # get 15-min bars for the ETFs
$ quantrocket history create-db 'etf-sampler-15min' -u 'etf-sampler' --bar-size '15 mins'
status: successfully created quantrocket.history.etf-sampler-15min.sqlite
$ quantrocket history collect 'etf-sampler-15min'
status: the historical data will be collected asynchronously

Define a Moonshot strategy and point it to the intraday history database:

class TrendDayStrategy(Moonshot):

    CODE = 'trend-day'
    DB = 'etf-sampler-15min'
    DB_TIME_FILTERS = ['14:00:00', '15:45:00']
    DB_FIELDS = ['Open','Close']
Note the use of DB_TIME_FILTERS and DB_FIELDS to limit the amount of data loaded into the backtest. Loading only the data you need is an important performance optimization for intraday strategies with large universes (albeit unnecessary in this particular example since the universe is small).

Working with intraday prices in Moonshot is identical to working with intraday prices in historical research. We use .xs to select particular times of day from the prices DataFrame, thereby reducing the DataFrame from intraday to daily. In this way our prices_to_signals method calculates the return from yesterday's close to 2:00 PM and uses it to make trading decisions:

def prices_to_signals(self, prices):

    closes = prices.loc["Close"]
    opens = prices.loc["Open"]

    # Take a cross section (xs) of prices to get a specific time's price;
    # the close of the 15:45 bar is the session close
    session_closes = closes.xs("15:45:00", level="Time")
    # the open of the 14:00 bar is the 14:00 price
    afternoon_prices = opens.xs("14:00:00", level="Time")

    # calculate the return from yesterday's close to 14:00
    prior_closes = session_closes.shift()
    returns = (afternoon_prices - prior_closes) / prior_closes

    # Go long if up more than 2%, go short if down more than -2%
    long_signals = returns > 0.02
    short_signals = returns < -0.02

    # Combine long and short signals
    signals = long_signals.astype(int).where(long_signals, -short_signals.astype(int))
    return signals

If you step through this code interactively, you'll see that after the use of .xs to select particular times of day from the prices DataFrame, all subsequent DataFrames have dates in the index but not times, just like with an end-of-day strategy.
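
For example, inspecting the index names before and after .xs (illustrative output):

>>> closes.index.names          # intraday prices have both Date and Time in the index
FrozenList(['Date', 'Time'])
>>> session_closes = closes.xs("15:45:00", level="Time")
>>> session_closes.index.names  # after .xs, only Date remains
FrozenList(['Date'])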

Because our prices_to_signals method has reduced intraday prices to daily signals, our signals_to_target_weights and target_weights_to_positions methods don't need to do any special "intraday handling" and therefore look similar to how they might look for a daily strategy:

def signals_to_target_weights(self, signals, prices):

    # allocate 20% of capital to each position, or equally divide capital
    # among positions, whichever is less
    target_weights = self.allocate_fixed_weights_capped(signals, 0.20, cap=1.0)
    return target_weights

def target_weights_to_positions(self, target_weights, prices):

    # We enter on the same day as the signals/target_weights
    positions = target_weights.copy()
    return positions

To calculate gross returns, we select the intraday prices that correspond to our entry and exit times and multiply the security's return by our position size:

def positions_to_gross_returns(self, positions, prices):

    closes = prices.loc["Close"]

    # Our signal came at 14:00 and we enter at 14:15 (the close of the 14:00 bar)
    entry_prices = closes.xs("14:00:00", level="Time")
    session_closes = closes.xs("15:45:00", level="Time")

    # Our return is the 14:15-16:00 return, multiplied by the position
    pct_changes = (session_closes - entry_prices) / entry_prices
    gross_returns = pct_changes * positions
    return gross_returns

Now we can run the backtest:

$ quantrocket moonshot backtest 'trend-day' --pdf -o trend_day.pdf --details
>>> from quantrocket.moonshot import backtest
>>> import moonchart
>>> backtest("trend-day", details=True, filepath_or_buffer="trend_day.csv")
>>> t = moonchart.Tearsheet()
>>> t.from_moonshot_csv("trend_day.csv")
$ curl -X POST 'http://houston/moonshot/backtests.pdf?strategies=trend-day' > trend_day.pdf

And view the performance:

moonshot trend day tearsheet

You can view the complete trend day strategy in the demo repository.

Commissions and slippage

Commissions

Moonshot supports realistic modeling of IB commissions. To model commissions, subclass the appropriate commission class, set the commission costs as per IB's website, then add the commission class to your strategy:

from moonshot import Moonshot
from moonshot.commission import PercentageCommission

class JapanStockFixedCommission(PercentageCommission):
    # look up commission costs on IB's website
    IB_COMMISSION_RATE = 0.0008 # 0.08% of trade value
    MIN_COMMISSION = 80.00 # JPY

class MyJapanStrategy(Moonshot):
    COMMISSION_CLASS = JapanStockFixedCommission
Because commission costs change from time to time, and because some cost components depend on account specifics such as your monthly trade volume or the degree to which you add or remove liquidity, Moonshot provides the commission logic but expects you to fill in the specific cost constants.

Percentage commissions

Use moonshot.commission.PercentageCommission where IB's commission is calculated as a percentage of the trade value. If you're using the tiered commission structure, you can also set an exchange fee (as a percentage of trade value). A variety of examples are shown below:

from moonshot.commission import PercentageCommission

class MexicoStockCommission(PercentageCommission):
    IB_COMMISSION_RATE = 0.0010
    MIN_COMMISSION = 60.00 # MXN

class SingaporeStockTieredCommission(PercentageCommission):
    IB_COMMISSION_RATE = 0.0008
    EXCHANGE_FEE_RATE = 0.00034775 + 0.00008025 # transaction fee + access fee
    MIN_COMMISSION = 2.50 # SGD

class UKStockTieredCommission(PercentageCommission):
    IB_COMMISSION_RATE = 0.0008
    EXCHANGE_FEE_RATE = 0.000045 + 0.0025 # 0.45 bps + 0.5% stamp tax on purchases > 1000 GBP
    MIN_COMMISSION = 1.00 # GBP

class HongKongStockTieredCommission(PercentageCommission):
    IB_COMMISSION_RATE = 0.0008
    EXCHANGE_FEE_RATE = (
          0.00005 # exchange fee
        + 0.00002 # clearing fee (2 HKD min)
        + 0.001 # Stamp duty
        + 0.000027 # SFC Transaction Levy
    )
    MIN_COMMISSION = 18.00 # HKD

class JapanStockTieredCommission(PercentageCommission):
    IB_COMMISSION_RATE = 0.0005 # 0.05% of trade value
    EXCHANGE_FEE_RATE = 0.00002 + 0.000004 # 0.002% Tokyo Stock Exchange fee + 0.0004% clearing fee
    MIN_COMMISSION = 80.00 # JPY

Per Share commissions

Use moonshot.commission.PerShareCommission to model commissions which are assessed per share (US and Canada stock commissions). Here is an example of a fixed commission for US stocks:

from moonshot.commission import PerShareCommission

class USStockFixedCommission(PerShareCommission):
    IB_COMMISSION_PER_SHARE = 0.005
    MIN_COMMISSION = 1.00

IB Cost-Plus commissions can be complex; in addition to the IB commission they may include exchange fees which are assessed per share (and which may differ depending on whether you add or remove liquidity), fees which are based on the trade value, and fees which are assessed as a percentage of the IB commission itself. These can also be modeled:

class CostPlusUSStockCommission(PerShareCommission):
    IB_COMMISSION_PER_SHARE = 0.0035
    EXCHANGE_FEE_PER_SHARE = (0.0002 # clearing fee per share
                             + (0.000119/2)) # FINRA activity fee (per share sold so divide by 2)
    MAKER_FEE_PER_SHARE = -0.002 # exchange rebate (varies)
    TAKER_FEE_PER_SHARE = 0.00118 # exchange fee (varies)
    MAKER_RATIO = 0.25 # assume 25% of our trades add liquidity, 75% take liquidity
    COMMISSION_PERCENTAGE_FEE_RATE = (0.000175 # NYSE pass-through (% of IB commission)
                                     + 0.00056) # FINRA pass-through (% of IB commission)
    PERCENTAGE_FEE_RATE = 0.0000231 # Transaction fees as a percentage of trade value
    MIN_COMMISSION = 0.35 # USD

class CanadaStockCommission(PerShareCommission):
    IB_COMMISSION_PER_SHARE = 0.008
    EXCHANGE_FEE_PER_SHARE = (
        0.00017 # clearing fee per share
        + 0.00011 # transaction fee per share
        )
    MAKER_FEE_PER_SHARE = -0.0019 # varies
    TAKER_FEE_PER_SHARE = 0.003 # varies
    MAKER_RATIO = 0 # assume we always take liquidity
    MIN_COMMISSION = 1.00 # CAD

Futures commissions

moonshot.commission.FuturesCommission lets you define a commission, exchange fee, and carrying fee per contract:

from moonshot.commission import FuturesCommission

class GlobexEquityEMiniFixedCommission(FuturesCommission):
    IB_COMMISSION_PER_CONTRACT = 0.85
    EXCHANGE_FEE_PER_CONTRACT = 1.18
    CARRYING_FEE_PER_CONTRACT = 0 # Depends on equity in excess of margin requirement

Forex commissions

Spot forex commissions are percentage-based, so moonshot.commission.SpotForexCommission can be used directly without subclassing:

from moonshot import Moonshot
from moonshot.commission import SpotForexCommission

class MyForexStrategy(Moonshot):
    COMMISSION_CLASS = SpotForexCommission

Note that at present, SpotForexCommission does not model minimum commissions (this has to do with the fact that the minimum commission for forex is always expressed in USD, rather than the currency of the traded security). This limitation means that if your trades are small, SpotForexCommission may underestimate the commission.

Minimum commissions

During backtests, Moonshot calculates and assesses commissions in percentage terms (relative to the capital allocated to the strategy) rather than in dollar terms. However, since minimum commissions are expressed in dollar terms, Moonshot must know your NLV (Net Liquidation Value, i.e. account balance) in order to accurately model minimum commissions in backtests. You can specify your NLV in your strategy definition or at the time you run a backtest.

If you trade in size and are unlikely ever to trigger minimum commissions, you don't need to model them.

NLV should be provided as key-value pairs of CURRENCY:NLV. You must provide the NLV in each currency you wish to model. For example, if your account balance is $100K USD, and your strategy trades instruments denominated in JPY and AUD, you could specify this on the strategy:

class MyAsiaStrategy(Moonshot):
    CODE = "my-asia-strategy"
    NLV = {
        "JPY": 100000 * 110, # 110 JPY per USD
        "AUD": 100000 * 1.25 # 1.25 AUD per USD
    }

Or pass the NLV at the time you run the backtest:

$ quantrocket moonshot backtest 'my-asia-strategy' --nlv 'JPY:11000000' 'AUD:125000' -o asia.csv
>>> backtest("my-asia-strategy", nlv={"JPY":11000000, "AUD":125000},
             filepath_or_buffer="asia.csv")
$ curl -X POST 'http://houston/moonshot/backtests.csv?strategies=my-asia-strategy&nlv=JPY%3A11000000&nlv=AUD%3A125000' > asia.csv
If you don't specify NLV on the strategy or via the nlv option, the backtest will still run; it simply won't take minimum commissions into account.

Multiple commission structures on the same strategy

You might run a strategy that trades multiple securities with different commission structures. Instead of specifying a single commission class, you can specify a Python dictionary associating each commission class with the respective security type, exchange, and currency it applies to:

class USStockFixedCommission(PerShareCommission):
    IB_COMMISSION_PER_SHARE = 0.005
    MIN_COMMISSION = 1.00

class GlobexEquityEMiniFixedCommission(FuturesCommission):
    IB_COMMISSION_PER_CONTRACT = 0.85
    EXCHANGE_FEE_PER_CONTRACT = 1.18

class MultiSecTypeStrategy(Moonshot):
    # this strategy trades NYSE and NASDAQ stocks and GLOBEX futures
    COMMISSION_CLASS = {
        # dict keys should be tuples of (security type, exchange, currency)
        ("STK", "NYSE", "USD"): USStockFixedCommission,
        ("STK", "NASDAQ", "USD"): USStockFixedCommission,
        ("FUT", "GLOBEX", "USD"): GlobexEquityEMiniFixedCommission
    }

Slippage

Fixed slippage

You can apply a fixed amount of slippage (in basis points) to the trades in your backtest by setting SLIPPAGE_BPS on your strategy:

class MyStrategy(Moonshot):
    ...
    SLIPPAGE_BPS = 5

The above will apply 5 basis points of one-way slippage to each trade. If you expect different slippage for entry vs exit, take the average.
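
For example, if you expect roughly 4 basis points of slippage on entry and 6 basis points on exit, set SLIPPAGE_BPS = 5.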

Parameter scans are a handy way to check your strategy's sensitivity to slippage:

$ quantrocket moonshot paramscan 'my-strategy' -p 'SLIPPAGE_BPS' -v 0 2.5 5 10 --pdf -o my_strategy_slippage.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> scan_parameters("my-strategy",
                    param1="SLIPPAGE_BPS", vals1=[0,2.5,5,10],
                    filepath_or_buffer="my_strategy_slippage.csv")
$ curl -X POST 'http://houston/moonshot/paramscans.pdf?strategies=my-strategy&param1=SLIPPAGE_BPS&vals1=0&vals1=2.5&vals1=5&vals1=10' > my_strategy_slippage.pdf

You can research bid-ask spreads for the purpose of estimating slippage by collecting intraday historical data using the BID, ASK, or BID_ASK bar types.

Commissions and slippage for intraday positions

If you run an intraday strategy that closes its positions the same day it opens them, you should set a parameter (POSITIONS_CLOSED_DAILY, see below) to tell Moonshot you're doing this so that it can more accurately assess commissions and slippage. Here's why:

Moonshot calculates commissions and slippage by first diff()ing the positions DataFrame in your backtest to calculate the day-to-day turnover. For example, suppose we entered a position in AAPL, then reduced the position the next day, then maintained the position for a day, then closed the position. Our holdings look like this:

>>> positions.head()
         AAPL(265598)
Date
2012-01-06      0.000
2012-01-09      0.500 # buy position worth 50% of capital
2012-01-10      0.333 # reduce position to 33% of capital
2012-01-11      0.333 # hold position
2012-01-12      0.000 # close out position

The corresponding DataFrame of trades, representing our turnover due to opening and closing the position, would look like this:

>>> trades = positions.diff()
>>> trades.head()
         AAPL(265598)
Date
2012-01-06       NaN
2012-01-09      0.500 # buy position worth 50% of capital
2012-01-10     -0.167 # reduce position to 33% of capital
2012-01-11      0.000 # hold position
2012-01-12     -0.333 # close out position

Commissions and slippage are applied against this DataFrame of trades.

The default use of diff() to calculate trades from positions involves an assumption: that adjacent, same-side positions in the positions DataFrame represent continuous holdings. For strategies that close out their positions each day, this assumption isn't correct. For example, the positions DataFrame from above might actually indicate 3 positions opened and closed on 3 consecutive days, rather than 1 continuously held position:

>>> positions.head()
         AAPL(265598)
Date
2012-01-06      0.000
2012-01-09      0.500 # open and close out a position worth 50% of capital
2012-01-10      0.333 # open and close out a position worth 33% of capital
2012-01-11      0.333 # open and close out a position worth 33% of capital
2012-01-12      0.000

If so, diff() will underestimate turnover and thus underestimate commissions and slippage. The correct calculation of turnover is to multiply the positions by 2:

>>> trades = positions * 2
>>> trades.head()
         AAPL(265598)
Date
2012-01-06      0.000
2012-01-09      1.000 # buy 0.5 + sell 0.5
2012-01-10      0.667 # buy 0.33 + sell 0.33
2012-01-11      0.667 # buy 0.33 + sell 0.33
2012-01-12      0.000

As there is no reliable way for Moonshot to infer automatically whether adjacent, same-side positions are continuously held or closed out daily, you must set POSITIONS_CLOSED_DAILY = True on the strategy if you want Moonshot to assume they are closed out daily:

class TrendDay(Moonshot):
    ...
    POSITIONS_CLOSED_DAILY = True

Otherwise, Moonshot will assume that adjacent, same-side positions are continuously held.

Position size constraints

Liquidity constraints

Instead of or in addition to limiting position sizes as described below, also consider using VWAP or other algorithmic orders to trade in size if you have a large account and/or wish to trade illiquid securities. VWAP orders can be modeled in backtests as well as used in live trading.

A backtest that assumes it is possible to buy or sell any security you want in any size you want is likely to be unrealistic. In the real world, a security's liquidity constrains the number of shares it is practical to buy or sell.

Maximum position sizes for long and short positions can be defined in your strategy's limit_position_sizes method. If defined, this method should return two DataFrames, one defining the maximum quantities (i.e. shares or contracts) allowed for longs and a second defining the maximum quantities allowed for shorts. The following example limits quantities to 1% of 15-day average daily volume:

def limit_position_sizes(self, prices):
    volumes = prices.loc["Volume"] # assumes end-of-day bars, for intraday bars use `.xs`
    mean_volumes = volumes.rolling(15).mean()
    max_shares = (mean_volumes * 0.01).round()
    max_quantities_for_longs = max_quantities_for_shorts = max_shares.shift()
    return max_quantities_for_longs, max_quantities_for_shorts

The returned DataFrames might resemble the following:

>>> max_quantities_for_longs.head()
ConId       1234 2345
Date
2018-05-18   100  200
2018-05-19   100  200
>>> max_quantities_for_shorts.head()
ConId       1234 2345
Date
2018-05-18   100  200
2018-05-19   100  200

In the above example, our strategy will be allowed to long or short at most 100 shares of ConId 1234 and 200 shares of ConId 2345.

Note that max_quantities_for_shorts can equivalently be represented with positive or negative numbers. Values of 100 and -100 are both interpreted to mean: short no more than 100 shares. (The same applies to max_quantities_for_longs—only the absolute value matters).

The shape and alignment of the returned DataFrames should match that of the target_weights returned by signals_to_target_weights. Target weights will be reduced, if necessary, so as not to exceed max_quantities_for_longs and max_quantities_for_shorts. Position size limits are applied in backtesting and in live trading.

You can return None for one or both DataFrames to indicate "no limits" (this is the default implementation in the Moonshot base class). For example to limit shorts but not longs:

def limit_position_sizes(self, prices):
    ...
    return None, max_quantities_for_shorts

Within a DataFrame, any None or NaN will be treated as "no limit" for that particular security and date.

If you define position size limits for longs or shorts or both, you must specify the NLV to use for the backtest. This is because the target_weights returned by signals_to_target_weights are expressed as percentages of capital, and NLV is required for Moonshot to convert the percentage weights to the corresponding number of shares/contracts so that the position size limits can be enforced. NLV should be provided as key-value pairs of CURRENCY:NLV, and should be provided for each currency represented in the strategy. For example, if your account balance is $100K USD, and your strategy trades instruments denominated in JPY and USD, you could specify NLV on the strategy:

class MyStrategy(Moonshot):
    CODE = "my-strategy"
    NLV = {
        "USD": 100000,
        "JPY": 100000 * 110, # 110 JPY per USD
    }

Or pass the NLV at the time you run the backtest:

$ quantrocket moonshot backtest 'my-strategy' --nlv 'JPY:11000000' 'USD:100000' -o backtest_results.csv
>>> backtest("my-strategy", nlv={"JPY":11000000, "USD":100000},
             filepath_or_buffer="backtest_results.csv")
$ curl -X POST 'http://houston/moonshot/backtests.csv?strategies=my-strategy&nlv=JPY%3A11000000&nlv=USD%3A100000' > backtest_results.csv

Short sale constraints

You can use short sale availability data from IB to model short sale constraints in your backtests, including the available quantity of shortable shares and the associated borrow fees for overnight positions.

Shortable shares

One way to use shortable shares data is to enforce position limits based on share availability:

from quantrocket.fundamental import get_shortable_shares_reindexed_like

def limit_position_sizes(self, prices):
    max_shares_for_shorts = get_shortable_shares_reindexed_like(prices.loc["Close"])
    return None, max_shares_for_shorts

Shortable shares data is available back to April 16, 2018. Prior to that date, get_shortable_shares_reindexed_like will return NaNs, which are interpreted by Moonshot as "no limit on position size".

Due to the limited historical depth of shortable shares data, a useful approach is to develop your strategy without modeling short sale constraints, then run a parameter scan starting at April 16, 2018 to compare the performance with and without short sale constraints. Add a parameter to make your short sale constraint code conditional:

class ShortSaleStrategy(Moonshot):

    CODE = "shortseller"
    CONSTRAIN_SHORTABLE = False
    ...
    def limit_position_sizes(self, prices):
        if self.CONSTRAIN_SHORTABLE:
            max_shares_for_shorts = get_shortable_shares_reindexed_like(prices.loc["Close"])
        else:
            max_shares_for_shorts = None
        return None, max_shares_for_shorts

Then run the parameter scan:

$ quantrocket moonshot paramscan 'shortseller' -p 'CONSTRAIN_SHORTABLE' -v True False -s '2018-04-16' --nlv 'USD:1000000' --pdf -o shortseller_CONSTRAIN_SHORTABLE.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> import moonchart
>>> scan_parameters("shortseller", start_date="2018-04-16",
                    param1="CONSTRAIN_SHORTABLE", vals1=[True,False],
                    nlv={"USD":1000000},
                    filepath_or_buffer="shortseller_CONSTRAIN_SHORTABLE.csv")
>>> t = moonchart.ParamscanTearsheet()
>>> t.from_moonshot_csv("shortseller_CONSTRAIN_SHORTABLE.csv")
$ curl -X POST 'http://houston/moonshot/paramscans?strategies=shortseller&start_date=2018-04-16&param1=CONSTRAIN_SHORTABLE&vals1=True&vals1=False&pdf=true&nlv=USD%3A1000000' > shortseller_CONSTRAIN_SHORTABLE.pdf

Borrow fees

You can use a built-in slippage class to assess borrow fees on your strategy's overnight short positions. (Note that IB does not assess borrow fees on intraday positions.)

from moonshot import Moonshot
from moonshot.slippage import BorrowFees

class ShortSaleStrategy(Moonshot):

    CODE = "shortseller"
    SLIPPAGE_CLASSES = BorrowFees
    ...

The BorrowFees slippage class uses get_borrow_fees_reindexed_like to query annualized borrow fees, divides them by 252 (the approximate number of trading days in a year) to get a daily rate, and applies the daily rate to your short positions in backtesting. No fees are applied prior to the data's start date of April 16, 2018.
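
As a rough illustration of that arithmetic (illustrative only, not the BorrowFees source code, and the 6% rate is hypothetical):

# Illustrative sketch: nightly cost of carrying a short position
annualized_borrow_fee = 0.06       # hypothetical 6% per year
daily_rate = annualized_borrow_fee / 252
short_position_value = 10000       # USD held short overnight
nightly_cost = short_position_value * daily_rate
print(round(nightly_cost, 2))      # approximately 2.38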

To run a parameter scan with and without borrow fees, add the BorrowFees slippage as shown above and run a scan on the SLIPPAGE_CLASSES parameter with values of "default" (to test the strategy as-is, that is, with borrow fees) and "None":

$ quantrocket moonshot paramscan 'shortseller' -p 'SLIPPAGE_CLASSES' -v 'default' 'None' -s '2018-04-16' --nlv 'USD:1000000' --pdf -o shortseller_with_and_without_borrow_fees.pdf
>>> from quantrocket.moonshot import scan_parameters
>>> import moonchart
>>> scan_parameters("shortseller", start_date="2018-04-16",
                    param1="SLIPPAGE_CLASSES", vals1=["default",None],
                    nlv={"USD":1000000},
                    filepath_or_buffer="shortseller_with_borrow_fees.csv")
>>> t = moonchart.ParamscanTearsheet()
>>> t.from_moonshot_csv("shortseller_with_and_without_borrow_fees.csv")
$ curl -X POST 'http://houston/moonshot/paramscans?strategies=shortseller&start_date=2018-04-16&param1=SLIPPAGE_CLASSES&vals1=default&vals1=None&pdf=true&nlv=USD%3A1000000' > shortseller_with_and_without_borrow_fees.pdf

Live trading

Live trading quickstart

Live trading with Moonshot can be thought of as running a backtest on up-to-date historical data and placing a batch of orders based on the latest signals generated by the backtest.

Recall the moving average crossover strategy from the backtesting quickstart:

from moonshot import Moonshot

class DualMovingAverageStrategy(Moonshot):

    CODE = "dma-tech"
    DB = "tech-giants-1d"
    LMAVG_WINDOW = 300
    SMAVG_WINDOW = 100

    def prices_to_signals(self, prices):
        closes = prices.loc["Close"]

        # Compute long and short moving averages
        lmavgs = closes.rolling(self.LMAVG_WINDOW).mean()
        smavgs = closes.rolling(self.SMAVG_WINDOW).mean()

        # Go long when short moving average is above long moving average
        signals = smavgs.shift() > lmavgs.shift()

        return signals.astype(int)

To trade the strategy, the first step is to define one or more accounts (live or paper) in which you want to run the strategy, and how much of each account's capital to allocate. Account allocations should be defined in quantrocket.moonshot.allocations.yml, located in the /codeload directory in Jupyter (that is, in the top-level directory of the Jupyter file browser). Allocations should be expressed as a decimal percent of the total capital (Net Liquidation Value) of the account:

# quantrocket.moonshot.allocations.yml
#
# This file defines the percentage of total capital (Net Liquidation Value)
# to allocate to Moonshot strategies.
#

# each top level key is an account number
DU12345:
    # each second-level key-value is a strategy code and the percentage
    # of Net Liquidation Value to allocate
    dma-tech: 0.75  # allocate 75% of DU12345's Net Liquidation Value to dma-tech

Next, bring your history database up-to-date if you haven't already done so:

$ quantrocket history collect 'tech-giants-1d'
status: the historical data will be collected asynchronously
>>> from quantrocket.history import collect_history
>>> collect_history("tech-giants-1d")
{'status': 'the historical data will be collected asynchronously'}
$ curl -X POST 'http://houston/history/queue?codes=tech-giants-1d'
{"status": "the historical data will be collected asynchronously"}

Now you're ready to run the strategy. Running the strategy doesn't place any orders but generates a CSV of orders to be placed in a subsequent step:

$ quantrocket moonshot trade 'dma-tech' -o orders.csv
>>> from quantrocket.moonshot import trade
>>> trade("dma-tech", filepath_or_buffer="orders.csv")
$ curl -X POST 'http://houston/moonshot/orders.csv?strategies=dma-tech' > orders.csv

If any orders were generated, the CSV will look something like this:

$ csvlook -I orders.csv
| ConId     | Account | Action | OrderRef | TotalQuantity | Exchange | OrderType | Tif |
| --------- | ------- | ------ | -------- | ------------- | -------- | --------- | --- |
| 265598    | DU12345 | BUY    | dma-tech | 501           | SMART    | MKT       | DAY |
| 3691937   | DU12345 | BUY    | dma-tech | 58            | SMART    | MKT       | DAY |
| 15124833  | DU12345 | BUY    | dma-tech | 284           | SMART    | MKT       | DAY |
| 208813719 | DU12345 | BUY    | dma-tech | 86            | SMART    | MKT       | DAY |
If no orders were generated, there won't be a CSV. If this happens, you can re-run the strategy with the --review-date option to generate orders for an earlier date, and/or modify prices_to_signals to create some trades for the purpose of testing.

Finally, make sure IB Gateway is connected (quantrocket launchpad start) for the account you're trading, then place the orders with QuantRocket's blotter:

$ quantrocket blotter order -f orders.csv
>>> from quantrocket.blotter import place_orders
>>> place_orders(infilepath_or_buffer="orders.csv")
$ curl -X POST 'http://houston/blotter/orders' --upload-file orders.csv

Normally, you will run your live trading in an automated manner from the countdown service using the command line interface (CLI). With the CLI, you can generate and place Moonshot orders in a one-liner by piping the orders CSV to the blotter over stdin (indicated by passing - as the -f/--infile option):

$ quantrocket moonshot trade 'dma-tech' | quantrocket blotter order -f '-'

How live trading works

Live trading in Moonshot starts out just like a backtest:

  1. Prices are queried from your history database
  2. The prices DataFrame is passed to your prices_to_signals method, which returns a DataFrame of signals
  3. The signals DataFrame is passed to signals_to_target_weights, which returns a DataFrame of target weights

At this point, a backtest would proceed to simulate positions (target_weights_to_positions) then simulate returns (positions_to_gross_returns). In contrast, in live trading the target weights must be converted into a batch of live orders to be placed with the broker. This process happens as follows:

  1. First, Moonshot isolates the last row (corresponding to today) from the target weights DataFrame.
  2. Moonshot converts the target weights into the actual number of shares of each security to be ordered in each allocated account, taking into account the overall strategy allocation, the account balance, and any existing positions the strategy already holds.
  3. Moonshot provides you with a DataFrame of "order stubs" containing basic fields such as the account, action (buy or sell), order quantity, and contract ID (ConId).
  4. You can then customize the orders in the order_stubs_to_orders method by adding other order fields such as the order type, time in force, etc.

By default, the base class implementation of order_stubs_to_orders creates MKT DAY orders routed to SMART. The above quickstart example relies on this default behavior, but you should always override order_stubs_to_orders with your own order specifications.

From order stubs to orders

You can specify detailed order parameters in your strategy's order_stubs_to_orders method.

The order stubs DataFrame provided to this method resembles the following:

>>> print(orders)
    ConId  Account Action     OrderRef  TotalQuantity
0   12345   U12345   SELL  my-strategy            100
1   12345   U55555   SELL  my-strategy             50
2   23456   U12345    BUY  my-strategy            100
3   23456   U55555    BUY  my-strategy             50
4   34567   U12345    BUY  my-strategy            200
5   34567   U55555    BUY  my-strategy            100

Modify the DataFrame by appending additional columns. At minimum, you must provide the order type (OrderType), time in force (Tif), and the exchange to route the order to. The default implementation is shown below:

def order_stubs_to_orders(self, orders, prices):
    orders["Exchange"] = "SMART"
    orders["OrderType"] = "MKT"
    orders["Tif"] = "DAY"
    return orders

Moonshot isn't limited to a handful of canned order types. You can use any of the order parameters and order types supported by the IB API. Learn more about required and available order fields in the blotter documentation.

As shown in the above example, Moonshot uses your strategy code (e.g. "my-strategy") to populate the OrderRef field, a field used by the blotter for strategy-level tracking of your positions and performance.

Using prices and securities master fields in order creation

The prices DataFrame used throughout Moonshot is passed to order_stubs_to_orders, allowing you to use prices or securities master fields to create your orders. This is useful, for example, for setting limit prices, or applying different order rules for different exchanges.

The prices DataFrame covers multiple dates while the orders DataFrame represents a current snapshot. You can use the reindex_like_orders method to extract a current snapshot of data from the prices DataFrame. For example, create limit prices set to the prior close:

def order_stubs_to_orders(self, orders, prices):
    closes = prices.loc["Close"]
    prior_closes = closes.shift()
    prior_closes = self.reindex_like_orders(prior_closes, orders)
    orders["OrderType"] = "LMT"
    orders["LmtPrice"] = prior_closes
    ...

Or, direct-route orders to their primary exchange:

def order_stubs_to_orders(self, orders, prices):
    closes = prices.loc["Close"]
    exchanges = prices.loc["PrimaryExchange"].reindex(closes.index, method="ffill")
    exchanges = self.reindex_like_orders(exchanges, orders)
    orders["Exchange"] = exchanges
    ...

Account allocations

Define your strategy allocations in quantrocket.moonshot.allocations.yml, a YAML file located in the /codeload directory in Jupyter (that is, in the top-level directory of the Jupyter file browser). (The demo repository includes an example file.) You can run multiple strategies per account and/or multiple accounts per strategy. Allocations should be expressed as a decimal percent of the total capital (Net Liquidation Value) of the account:

# quantrocket.moonshot.allocations.yml
#
# This file defines the percentage of total capital (Net Liquidation Value)
# to allocate to Moonshot strategies.
#

# each top level key is an account number
DU12345:
    # each second-level key-value is a strategy code and the percentage
    # of Net Liquidation Value to allocate
    dma-tech: 0.75  # allocate 75% of DU12345's Net Liquidation Value to dma-tech
    dma-etf: 0.5 # allocate 50% of DU12345's Net Liquidation Value to dma-etf
U12345:
    dma-tech: 1 # allocate 100% of U12345's Net Liquidation Value to dma-tech

By default, when you trade a strategy, Moonshot generates orders for all accounts which define allocations for that strategy. However, you can limit to particular accounts:

$ quantrocket moonshot trade 'dma-tech' -a 'U12345'

Note that you can also run multiple strategies at a time:

$ quantrocket moonshot trade 'dma-tech' 'dma-etf'

How Moonshot calculates order quantities

The behavior outlined in this section is handled automatically by Moonshot but is provided for informational purposes.

The target weights generated by signals_to_target_weights are expressed in percentage terms (e.g. 0.1 = 10% of capital), but these weights must be converted into the actual numbers of shares, futures contracts, etc. that need to be bought or sold. Converting target weights into order quantities requires taking into account a number of factors including the strategy allocation, account NLV, exchange rates, existing positions, and security price.

The conversion process is outlined below for an account with USD base currency:

| Step | Source | Domestic stock example - AAPL (NASDAQ) | Foreign stock example - BP (London Stock Exchange) | Futures example - ES (GLOBEX) |
| ---- | ------ | -------------------------------------- | -------------------------------------------------- | ----------------------------- |
| What is target weight? | last row (= today) of target weights DataFrame | 0.2 | 0.2 | 0.2 |
| What is account allocation for strategy? | quantrocket.moonshot.allocations.yml | 0.5 | 0.5 | 0.5 |
| What is target weight for account? | multiply target weights by account allocations | 0.1 (0.2 x 0.5) | 0.1 (0.2 x 0.5) | 0.1 (0.2 x 0.5) |
| What is latest account NLV? | account service | $1M USD | $1M USD | $1M USD |
| What is target trade value in base currency? | multiply target weight for account by account NLV | $100K USD ($1M x 0.1) | $100K USD ($1M x 0.1) | $100K USD ($1M x 0.1) |
| What is exchange rate? (if trade currency differs from base currency) | account service | Not applicable | USD.GBP = 0.75 | Not applicable |
| What is target trade value in trade currency? | multiply target trade value in base currency by exchange rate | $100K USD | 75K GBP ($100K USD x 0.75 USD.GBP) | $100K USD |
| What is market price of security? | prices DataFrame | $185 USD | 572 pence (quoted in pence, not pounds) | $2690 USD |
| What is contract multiplier? (applicable to futures and options) | securities master service | Not applicable | Not applicable | 50x |
| What is price magnifier? (used when prices are quoted in fractional units, for example, pence instead of pounds) | securities master service | Not applicable | 100 (i.e. 100 pence per pound) | Not applicable |
| What is contract value? | contract value = (price x multiplier / price_magnifier) | $185 USD | 57.20 GBP (572 / 100) | $134,500 USD (2,690 x 50) |
| What is target quantity? | divide target trade value by contract value | 540 shares ($100K / $185) | 1311 shares (75K GBP / 57.20 GBP) | 1 contract ($100K / $134.5K) |
| Any current positions held by this strategy? | blotter service | 200 shares | 0 shares | 1 contract |
| What is the required order quantity? | subtract current positions from target quantities | 340 shares (540 - 200) | 1311 shares (1311 - 0) | 0 contracts (1 - 1) |
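
The same arithmetic, expressed in code for the AAPL (NASDAQ) column of the table above (variable names are illustrative, not Moonshot internals):

# Worked example mirroring the domestic stock (AAPL) column above
target_weight = 0.2          # last row of target weights DataFrame
account_allocation = 0.5     # from quantrocket.moonshot.allocations.yml
account_nlv = 1000000        # USD, from the account service
price = 185                  # USD, from the prices DataFrame
existing_position = 200      # shares, from the blotter service

target_trade_value = target_weight * account_allocation * account_nlv  # $100K USD
target_quantity = int(target_trade_value / price)                      # 540 shares
order_quantity = target_quantity - existing_position                   # 340 shares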

Semi-manual vs automated trading

Since Moonshot generates a CSV of orders but doesn't actually place the orders, you can inspect the orders before placing them, if you prefer:

$ quantrocket moonshot trade 'my-strategy' -o orders.csv
$ csvlook -I orders.csv
| ConId     | Account | Action | OrderRef    | TotalQuantity | Exchange | OrderType | Tif |
| --------- | ------- | ------ | ----------- | ------------- | -------- | --------- | --- |
| 265598    | DU12345 | BUY    | my-strategy | 501           | SMART    | MKT       | DAY |
| 3691937   | DU12345 | BUY    | my-strategy | 58            | SMART    | MKT       | DAY |
| 15124833  | DU12345 | BUY    | my-strategy | 284           | SMART    | MKT       | DAY |
| 208813719 | DU12345 | BUY    | my-strategy | 86            | SMART    | MKT       | DAY |

If desired, you can edit the orders inside JupyterLab (right-click on filename > Open With > Editor). When ready, place the orders:

$ quantrocket blotter order -f orders.csv

For automated trading, pipe the orders CSV directly to the blotter over stdin:

$ quantrocket moonshot trade 'my-strategy' | quantrocket blotter order -f '-'

You can schedule this command to run on your countdown service. Be sure to read about collecting and using trading calendars, which enable you to run your trading command conditionally based on whether the market is open:

# Run strategy at 10:30 AM if market is open
30 10 * * mon-fri quantrocket master isopen 'NASDAQ' && quantrocket moonshot trade 'my-strategy' | quantrocket blotter order -f '-'
In the event your strategy produces no orders, the blotter is designed to accept an empty file and simply do nothing.

Schedule live trading

Moonshot does not use a real-time data feed per se for live trading. Rather, it relies on recently updated historical data. How recently depends on the timeframe of your strategy.

For an end of day strategy, you might schedule your historical database to be brought current each evening after the market closes and schedule Moonshot to run after that. Your countdown service crontab might look like this:

# Update history db at 5:30 PM if market was open today
30 17 * * mon-fri quantrocket master isopen 'NASDAQ' --ago '5h' && quantrocket history collect 'nasdaq-eod'

# Run strategy at 9:00 AM if market is open
0 9 * * mon-fri quantrocket master isopen 'NASDAQ' --in '1h' && quantrocket moonshot trade 'eod-strategy' | quantrocket blotter order -f '-'

For an intraday strategy that uses 15-minute bars and enters the market at 10:00 AM based on 9:45 AM prices, you would schedule your historical database to be brought current just after 9:45 AM and schedule Moonshot to run at 10:00 AM. Moonshot will generate orders based on the just-collected 9:45 AM prices.

# Update history db at 9:46 AM if market is open
46 9 * * mon-fri quantrocket master isopen 'ARCA' && quantrocket history collect 'arca-15min'

# Run strategy at 10:00 AM if market is open
0 10 * * mon-fri quantrocket master isopen 'ARCA' && quantrocket moonshot trade 'intraday-strategy' | quantrocket blotter order -f '-'

In the above example, the 15-minute lag between collecting prices and placing orders mirrors the 15-minute bar size used in backtests. For smaller bar sizes, a smaller lag between data collection and order placement would be used.

The time it takes to update your historical data before generating orders imposes a practical limit on the quantity of data you can use in your strategy (quantity of data is a function of universe size and bar granularity). If your database contains too many securities for the bar granularity, it will take longer than the bar duration to update and thus will require you to delay your orders beyond what your backtest simulates.

Review the sections on scheduling and trading calendars to learn more about scheduling your strategies to run.

Trade date validation

In live trading as in backtesting, a Moonshot strategy receives a DataFrame of historical prices and derives DataFrames of signals and target weights. In live trading, orders are created from the last row of the target weights DataFrame. To make sure you're not trading on stale data (for example because your history database hasn't been brought current), Moonshot validates that the target weights DataFrame is up-to-date.

Suppose our target weights DataFrame resembles the following:

>>> target_weights.tail()
            AAPL(265598)  AMZN(3691937)
Date
2018-05-04             0              0
2018-05-07           0.5              0
2018-05-08           0.5              0
2018-05-09             0              0
2018-05-10          0.25           0.25

By default, Moonshot looks for and extracts the row corresponding to today's date in the strategy timezone. (The strategy timezone can be set with the class attribute TIMEZONE and is otherwise inferred from the timezone of the component securities.) Thus, if running the strategy on 2018-05-10, Moonshot would extract the last row from the above DataFrame. If running the strategy on 2018-05-11 or later, Moonshot will fail with the error:

msg: expected signal date 2018-05-11 not found in weights DataFrame, is the underlying
  data up-to-date? (max date is 2018-05-10)
status: error

This default validation behavior is appropriate for intraday strategies as well as end-of-day strategies that run after the market close, in both cases ensuring that today's price history is available to the strategy. However, if your strategy doesn't run until before the market open (for example because you need to collect fundamental data overnight), this validation behavior is too restrictive. In this case, you can set the CALENDAR attribute on the strategy to an exchange code, and that exchange's trading calendar will be used for trade date validation instead of the timezone:

class MyStrategy(Moonshot):
    ...
    CALENDAR = "NYSE"
    ...

Specifying the calendar allows Moonshot to be a little smarter, as it will only enforce the data being updated through the last date the exchange was open. Thus, if the strategy runs when the exchange is open, Moonshot still expects today's date to be in the target weights DataFrame. But if the exchange is currently closed, Moonshot expects the date corresponding to the last date the exchange was open. This allows you to run the strategy before the market open using the prior session's data, while still enforcing that the data is not older than the previous session.

Review orders from earlier dates

At times you may want to bypass trade date validation and generate orders for an earlier date, for testing or troubleshooting purposes. You can pass a --review-date for this purpose:

$ quantrocket moonshot trade 'dma-tech' --review-date '2018-05-09' -o past_orders.csv
>>> from quantrocket.moonshot import trade
>>> trade("dma-tech", review_date="2018-05-09", filepath_or_buffer="past_orders.csv")
$ curl -X POST 'http://houston/moonshot/orders.csv?strategies=dma-tech&review_date=2018-05-09' > past_orders.csv

Exiting positions

There are 3 ways to exit positions in Moonshot:

  1. Exit by rebalancing
  2. Attach exit orders
  3. Close positions with the blotter

Exit by rebalancing

By default, Moonshot calculates an order diff between your target positions and existing positions. This means that previously entered positions will be closed once the target position goes to 0, as Moonshot will generate the closing order needed to achieve the target position. This is a good fit for strategies that periodically rebalance.
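
For example, if the strategy currently holds 200 shares of a security and today's target position for that security is 0, Moonshot will generate a SELL order for 200 shares to flatten the position.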

Learn more about rebalancing.

Attach exit orders

Sometimes, instead of relying on rebalancing, it's helpful to submit exit orders at the time you submit your entry orders. For example, if your strategy enters the market intraday and exits at market close, it's easiest to submit the entry and exit orders at the same time.

This is referred to as attaching a child order, and can be used for bracket orders, hedging orders, or in this case, simply a pre-planned exit order. The attached order is submitted to IB's system but is only executed if the parent order executes.

Moonshot provides a utility method for creating attached child orders, orders_to_child_orders, which can be used like this:

def order_stubs_to_orders(self, orders, prices):

    # enter using market orders
    orders["Exchange"] = "SMART"
    orders["OrderType"] = "MKT"
    orders["Tif"] = "Day"

    # exit using MOC orders
    child_orders = self.orders_to_child_orders(orders)
    child_orders.loc[:, "OrderType"] = "MOC"

    orders = pd.concat([orders, child_orders])
    return orders

The orders_to_child_orders method creates child orders by copying your orders DataFrame but reversing the Action (BUY/SELL), and linking the child orders to the parent orders via an OrderId column on the parent orders and a ParentId column on the child orders. Interactively, the above example would look like this:

>>> orders.head()
    ConId   Action  TotalQuantity Exchange OrderType  Tif
0   12345      BUY            200    SMART       MKT  Day
1   23456      BUY            400    SMART       MKT  Day
>>> # create child orders from orders
>>> child_orders = self.orders_to_child_orders(orders)
>>> # modify child orders as desired
>>> child_orders.loc[:, "OrderType"] = "MOC"
>>> orders = pd.concat([orders, child_orders])
>>> orders.head()
    ConId   Action  TotalQuantity Exchange OrderType  Tif  OrderId  ParentId
0   12345      BUY            200    SMART       MKT  Day        0       NaN
1   23456      BUY            400    SMART       MKT  Day        1       NaN
0   12345     SELL            200    SMART       MOC  Day      NaN         0
1   23456     SELL            400    SMART       MOC  Day      NaN         1
Note that the OrderId and ParentId generated by Moonshot are not the actual order IDs used by the blotter. The blotter uses OrderId/ParentId (if provided) to identify linked orders but then generates the actual order IDs at the time of order submission to IB.

Close positions with the blotter

A third option for closing positions is to use the blotter to flatten all positions for a strategy. For example, if your strategy enters positions in the morning and exits on the close, you could design the strategy to create the entry orders only, then schedule a command in the afternoon to flatten the positions:

# enter positions in the morning (assuming strategy is designed to create entry orders only)
0 10 * * mon-fri quantrocket master isopen 'TSE' && quantrocket moonshot trade 'canada-intraday' | quantrocket blotter order -f '-'

# exit positions at the close
0 15 * * mon-fri quantrocket blotter close --order-refs 'canada-intraday' --params 'OrderType:MOC' 'Tif:Day' 'Exchange:TSE' | quantrocket blotter order -f '-'

This approach works best in scenarios where you want to flatten all positions in between each successive run of the strategy. Such scenarios can also be handled by attaching exit orders.

Learn more about closing positions with the blotter.

Tick sizes

Price rounding

When placing limit orders, stop orders, or other orders that specify price levels, it is necessary to ensure that the price you submit to IB adheres to the security's tick size rules (also called minimum price increments in IB parlance). This refers to the minimum difference between price levels at which a security can trade.

Some securities have constant price increments at all price levels. For example, most US stocks trade in penny increments. Other securities have different minimum increments on different exchanges on which they trade and/or different minimum increments at different price levels. For example, these are the tick size rules for orders for MITSUBISHI CORP direct-routed to the Tokyo Stock Exchange:

| If price is between...   | Tick size is... |
| ------------------------ | --------------- |
| 0 - 1,000                | 0.1             |
| 1,000 - 3,000            | 0.5             |
| 3,000 - 10,000           | 1               |
| 10,000 - 30,000          | 5               |
| 30,000 - 100,000         | 10              |
| 100,000 - 300,000        | 50              |
| 300,000 - 1,000,000      | 100             |
| 1,000,000 - 3,000,000    | 500             |
| 3,000,000 - 10,000,000   | 1,000           |
| 10,000,000 - 30,000,000  | 5,000           |
| 30,000,000 -             | 10,000          |

In contrast, SMART-routed orders for Mitsubishi must adhere to a different, simpler set of tick size rules:

| If price is between... | Tick size is... |
| ---------------------- | --------------- |
| 0 - 5,000              | 0.1             |
| 5,000 - 100,000        | 1               |
| 100,000 -              | 10              |

Luckily you don't need to keep track of tick size rules as they are stored in the securities master database. You can create your Moonshot orders CSV with unrounded prices then pass the CSV to the master service for price rounding. For example, consider two limit orders for Mitsubishi, one SMART-routed and one direct-routed to TSEJ, with unrounded limit prices of 15203.1135 JPY:

$ csvlook -I orders.csv
| ConId    | Account | Action | OrderRef       | TotalQuantity | Exchange | OrderType | LmtPrice   | Tif |
| -------- | ------- | ------ | -------------- | ------------- | -------- | --------- | ---------- | --- |
| 13905888 | DU12345 | BUY    | japan-strategy | 1000          | SMART    | LMT       | 15203.1135 | DAY |
| 13905888 | DU12345 | BUY    | japan-strategy | 1000          | TSEJ     | LMT       | 15203.1135 | DAY |

If you pass this CSV to the master service and tell it which columns to round, it will round the prices in those columns based on the tick size rules for that ConId and Exchange:

$ quantrocket master ticksize -f orders.csv --round 'LmtPrice' -o rounded_orders.csv
>>> from quantrocket.master import round_to_tick_sizes
>>> round_to_tick_sizes("orders.csv", round_fields=["LmtPrice"], outfilepath_or_buffer="rounded_orders.csv")
$ curl -X GET 'http://houston/master/ticksizes.csv?round_fields=LmtPrice' --upload-file orders.csv > rounded_orders.csv

The SMART-routed order is rounded to the nearest Yen while the TSEJ-routed order is rounded to the nearest 5 Yen, as per the tick size rules. Other columns are returned unchanged:

$ csvlook -I rounded_orders.csv
| ConId    | Account | Action | OrderRef       | TotalQuantity | Exchange | OrderType | LmtPrice | Tif |
| -------- | ------- | ------ | -------------- | ------------- | -------- | --------- | -------- | --- |
| 13905888 | DU12345 | BUY    | japan-strategy | 1000          | SMART    | LMT       | 15203.0  | DAY |
| 13905888 | DU12345 | BUY    | japan-strategy | 1000          | TSEJ     | LMT       | 15205.0  | DAY |
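
The rounding itself is simple arithmetic. Below is a minimal sketch (not the master service's actual implementation) that reproduces the two rounded prices above:

>>> price = 15203.1135
>>> # SMART route: tick size of 1 JPY at this price level
>>> round(price / 1) * 1
15203
>>> # TSEJ direct route: tick size of 5 JPY at this price level
>>> round(price / 5) * 5
15205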

The ticksize command accepts file input over stdin, so you can pipe your moonshot orders directly to the master service for rounding, then pipe the rounded orders to the blotter for submission:

$ quantrocket moonshot trade 'my-japan-strategy' | quantrocket master ticksize -f '-' --round 'LmtPrice' | quantrocket blotter order -f '-'
In the event your strategy produces no orders, the ticksize command, like the blotter, is designed to accept an empty file and simply do nothing.

If you need the actual tick sizes and not just the rounded prices, you can instruct the ticksize endpoint to include the tick sizes in the resulting file:

$ quantrocket master ticksize -f orders.csv --round 'LmtPrice' --append-ticksize -o rounded_orders.csv
>>> from quantrocket.master import round_to_tick_sizes
>>> round_to_tick_sizes("orders.csv", round_fields=["LmtPrice"], append_ticksize=True, outfilepath_or_buffer="rounded_orders.csv")
$ curl -X GET 'http://houston/master/ticksizes.csv?round_fields=LmtPrice&append_ticksize=true' --upload-file orders.csv > rounded_orders.csv

A new column with the tick sizes will be appended, in this case called "LmtPriceTickSize":

$ csvlook -I rounded_orders.csv
| ConId    | Account | Action | OrderRef       | TotalQuantity | Exchange | OrderType | LmtPrice | Tif | LmtPriceTickSize |
| -------- | ------- | ------ | -------------- | ------------- | -------- | --------- | -------- | --- | ---------------- |
| 13905888 | DU12345 | BUY    | japan-strategy | 1000          | SMART    | LMT       | 15203.0  | DAY | 1.0              |
| 13905888 | DU12345 | BUY    | japan-strategy | 1000          | TSEJ     | LMT       | 15205.0  | DAY | 5.0              |

Tick sizes can be used for submitting orders that require price offsets such as Relative/Pegged-to-Primary orders.

Note that for securities with constant price increments, for example US stocks that trade in penny increments, you also have the option of simply rounding the prices in your strategy code using Pandas' round():

def order_stubs_to_orders(self, orders, prices):

    ...
    orders["OrderType"] = "LMT"
    # set limit prices 2% above prior close
    prior_closes = prices.loc["Close"].shift()
    prior_closes = self.reindex_like_orders(prior_closes, orders)
    limit_prices = prior_closes * 1.02
    orders["LmtPrice"] = limit_prices.round(2)
    ...

Price offsets

Some orders, such as Relative/Pegged-to-Primary orders, require defining an offset amount using the AuxPrice field. In the case of Relative orders, which move dynamically with the market, the offset amount defines how much more aggressive than the NBBO the order should be.

In some cases, it may suffice to hard-code an offset amount, e.g. $0.01:

def order_stubs_to_orders(self, orders, prices):

    orders["Exchange"] = "SMART"
    orders["OrderType"] = "REL"
    orders["AuxPrice"] = 0.01
    ...

However, as the offset must conform to the security's tick size rules, for some exchanges it's necessary to look up the tick size and use that to define the offset:

import pandas as pd
import io
from quantrocket.master import round_to_tick_sizes
...

def order_stubs_to_orders(self, orders, prices):

    orders["Exchange"] = "SMART"
    orders["OrderType"] = "REL"

    # Temporarily append prior closes to orders DataFrame
    prior_closes = prices.loc["Close"].shift()
    prior_closes = self.reindex_like_orders(prior_closes, orders)
    orders["PriorClose"] = prior_closes

    # Use the ticksize endpoint to get tick sizes based on
    # the latest close
    infile = io.StringIO()
    outfile = io.StringIO()
    orders.to_csv(infile, index=False)
    infile.seek(0)  # rewind the buffer before passing it to the master service
    round_to_tick_sizes(infile, round_fields=["PriorClose"], append_ticksize=True, outfilepath_or_buffer=outfile)
    outfile.seek(0)  # rewind before reading the results
    tick_sizes = pd.read_csv(outfile).PriorCloseTickSize

    # Set the REL offset to 2 tick increments
    orders["AuxPrice"] = tick_sizes * 2

    # Drop temporary column
    orders.drop("PriorClose", axis=1, inplace=True)
    ...

Paper trading

There are several options for testing your trades before you run your strategy on a live account. You can log the trades to flightlog, you can inspect the orders before placing them, and you can trade against your IB paper account.

Log trades to flightlog

After researching and backtesting a strategy in aggregate it's often nice to carefully inspect a handful of actual trades before committing real money. A good option is to start running the strategy but log the trades to flightlog instead of sending them to the blotter:

# Trade (log to flightlog) before the open
0 9 * * mon-fri quantrocket master isopen 'NYSE' --in 1h && quantrocket moonshot trade 'mean-reverter' | quantrocket flightlog log --name 'mean-reverter'

Then manually inspect the trades to see if you're happy with them.

Semi-manual trading

Another option which works well for end-of-day strategies is to generate the Moonshot orders, inspect the CSV file, then manually place the orders if you're happy. See the section on semi-manual trading.

IB Paper trading

You can also paper trade the strategy using your IB paper trading account. To do so, allocate the strategy to your paper account in quantrocket.moonshot.allocations.yml:

DU12345: # paper account numbers start with D
    mystrategy: 0.5

Then add the appropriate command to your countdown crontab, just as you would for a live account.

IB Paper trading limitations

IB paper trading accounts provide a useful way to dry-run your strategy, but it's important to note that IB's paper trading environment is not a full-scale simulation. For example, IB doesn't attempt to simulate certain order types such as on-the-open and on-the-close orders; such orders are accepted by the system but never filled. You may need to work around this limitation by modifying your orders for live vs paper accounts.

Paper trading is primarily useful for validating that your strategy is generating the orders you expect. It's less helpful for seeing what those orders do in the market or performing out-of-sample testing. For that, consider a small allocation to a live account.

See IB's website for a list of paper trading limitations .

Different orders for live vs paper accounts

As some order types aren't supported in IB paper accounts, you can specify different orders for paper vs live accounts:

def order_stubs_to_orders(self, orders, prices):

    orders["OrderType"] = "MKT"
    # Use market-on-open (TIF OPG) orders for live accounts, but
    # vanilla market orders for paper accounts
    orders["Tif"] = "OPG"
    # Paper accounts start with D
    orders.loc[orders.Account.str.startswith("D"), "Tif"] = "DAY"
    ...

Rebalancing

Periodic rebalancing

A Moonshot strategy's prices_to_signals logic will typically calculate signals for each day in the prices DataFrame. However, for many factor model or cross-sectional strategies, you may not wish to rebalance that frequently. For example, suppose our strategy logic ranks stocks every day by momentum and buys the top 10%:

>>> # Calculate 12-month returns
>>> returns = closes/closes.shift(252) - 1
>>> # Rank by return
>>> ranks = returns.rank(axis=1, ascending=False, pct=True)
>>> # Buy the top 10%
>>> signals = (ranks <= 0.1).astype(int)
>>> signals.head()
ConId      123456 234567 ...
Date
2018-05-31      1      0
2018-06-01      0      1
2018-06-02      0      0
2018-06-03      1      0
...
2018-06-30      0      1
2018-07-01      0      1
2018-07-02      1      0

As implemented above, the strategy will trade in and out of positions daily. Instead, we can limit the strategy to monthly rebalancing:

>>> # Resample using the rebalancing interval.
>>> # Keep only the last signal of the month, then fill it forward
>>> # For valid arguments for `resample()`, see:
>>> #     https://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
>>> #     https://pandas.pydata.org/pandas-docs/stable/timeseries.html#anchored-offsets
>>> signals = signals.resample("M").last()
>>> signals = signals.reindex(closes.index, method="ffill")
>>> signals.head()
ConId      123456 234567 ...
Date
2018-05-31      1      0
2018-06-01      1      0
2018-06-02      1      0
2018-06-03      1      0
...
2018-06-30      0      1
2018-07-01      0      1
2018-07-02      0      1

Then, in live trading, to mirror the resampling logic, schedule the strategy to run only on the first trading day of the month:

0 9 * * mon-fri quantrocket master isclosed 'NASDAQ' --since 'M' && quantrocket master isopen 'NASDAQ' --in '1h' && quantrocket moonshot trade 'nasdaq-momentum' | quantrocket blotter order -f '-'

Disabling rebalancing

By default, Moonshot generates orders as needed to achieve your target weights, after taking account of your existing positions. This design is well-suited for strategies that periodically rebalance positions. However, in live trading, this behavior can be suboptimal for strategies that hold multi-day positions which are not intended to be rebalanced. You may wish to disable rebalancing for such strategies.

For example, suppose your strategy calls for holding a 5% position of AAPL for a period of several days. When you enter the position, your account balance is $1M USD and the price of AAPL is $100, so you buy 500 shares ($1M X 0.05 / $100). A day later, your account balance is $1.02M, while the price of AAPL is $97, so Moonshot calculates your target position as 526 shares ($1.02M X 0.05 / $97) and creates an order to buy 26 shares (526 - 500). The following day, your account balance is unchanged at $1.02M but the price of AAPL is $98.50, resulting in a target position of 518 shares and a net order to sell 8 shares (518 - 526). Day-to-day changes in the share price and/or your account balance result in small buy or sell orders for the duration of the position.
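
The arithmetic behind these small daily orders can be sketched as follows (hypothetical numbers from the example above):

>>> target_weight = 0.05
>>> existing_shares = 500                               # entered at $100 with a $1M balance
>>> nav = 1020000                                       # balance a day later
>>> target_shares = round(nav * target_weight / 97)     # AAPL now at $97
>>> target_shares
526
>>> target_shares - existing_shares                     # net order: BUY 26
26
>>> round(nav * target_weight / 98.50) - target_shares  # next day at $98.50: SELL 8
-8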

These small rebalancing orders are problematic because they incur slippage and commissions which are not reflected in a backtest. In a backtest, the position is maintained at a constant weight of 5% so there are no day-to-day transaction costs. Thus, the daily rebalancing orders will introduce hidden costs into live performance compared to backtested performance.

You can disable rebalancing for a strategy using the ALLOW_REBALANCE parameter:

class MultiDayStrategy(Moonshot):

    ...
    ALLOW_REBALANCE = False

When ALLOW_REBALANCE is set to False, Moonshot will not create orders to rebalance a position which is already on the correct side (long or short). Moonshot will still create orders as needed to open a new position, close an existing position, or change sides (long to short or short to long). When ALLOW_REBALANCE is True (the default), Moonshot creates orders as needed to achieve the target weight.

You can also use a decimal percentage with ALLOW_REBALANCE to allow rebalancing only when the target position is sufficiently different from the existing position size. For example, don't rebalance unless the position size will change by at least 25%:

class MultiDayStrategy(Moonshot):

    ...
    ALLOW_REBALANCE = 0.25

In this example, if the target position size is 600 shares and the current position size is 500 shares, the rebalancing order will be suppressed because 100/500 < 0.25. If the target position is 300 shares, the rebalancing order will be allowed because 200/500 > 0.25.
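
The comparison can be sketched like this (a minimal illustration using the share counts above, not Moonshot's actual implementation):

>>> allow_rebalance = 0.25
>>> existing_shares = 500
>>> # 600-share target: 100/500 = 0.2 change, below the threshold, so the order is suppressed
>>> abs(600 - existing_shares) / existing_shares >= allow_rebalance
False
>>> # 300-share target: 200/500 = 0.4 change, above the threshold, so the order is allowed
>>> abs(300 - existing_shares) / existing_shares >= allow_rebalance
True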

By disabling rebalancing, your commissions and slippage will mirror your backtest. However, your live position weights will fluctuate and differ somewhat from the constant weights of your backtest, and as a result your live returns will not match your backtest returns exactly. This is often a good trade-off because the discrepancy in position weights (and thus returns) is usually two-sided (i.e. sometimes in your favor, sometimes not) and thus roughly nets out, while the added transaction costs of daily rebalancing are a one-sided cost that degrades live performance.

Algorithmic orders

IB provides various algorithmic order types which can be helpful for working large orders into the market. In fact, if you submit a market order that is too big based on the security's liquidity, IB might reject the order with this message:

quantrocket.blotter: WARNING ibg2 client 6001 got IB error code 202: Order Canceled - reason:In accordance with our regulatory obligations, we have rejected this order because it is too large compared to the liquidity that is generally available for this product. If you would like to submit an order of this size, please submit an algorithmic order (such as VWAP, TWAP, or Percent of Volume)

IB historical data for the default TRADES bar type includes a Wap field, which is defined by IB as "the VWAP over trade data filtered for some trade types (combos, derivatives, odd lots, block trades)".

>>> prices = get_historical_prices("usa-stk-1d", fields=["Wap"])
>>> vwaps = prices.loc["Wap"]

This makes it possible to use the Wap field to calculate returns in your backtest, then use IB's "Vwap" order algo in live trading (or a similar order algo) to mirror your backtest.

VWAP for end-of-day strategies

For an end-of-day strategy, the relevant example code for a backtest is shown below:

class UpMinusDown(Moonshot):

    ...
    # ask for Wap field (not included by default)
    DB_FIELDS = ["Wap", "Volume", "Close"]
    ...

    def positions_to_gross_returns(self, positions, prices):
        # enter at the next day's VWAP
        vwaps = prices.loc["Wap"]
        # The return is the security's percent change over the period following
        # position entry, multiplied by the position.
        gross_returns = vwaps.pct_change() * positions.shift()
        return gross_returns

Here, we are modeling our orders being filled at the next day's VWAP. Then, for live trading, create orders using IB's VWAP algo:

class UpMinusDown(Moonshot):

    ...
    def order_stubs_to_orders(self, orders, prices):

        # Enter using IB Vwap algo
        orders["OrderType"] = "MKT"
        orders["AlgoStrategy"] = "Vwap"
        orders["Tif"] = "DAY"
        orders["Exchange"] = "SMART"
        return orders

If placed before the market open, IB will seek to fill this order over the course of the day at the day's VWAP, thus mirroring our backtest.

VWAP for intraday strategies

VWAP orders can also be modeled and used on an intraday timeframe. For example, suppose we are using 30-minute bars and want to enter and exit positions gradually between 3:00 and 3:30 PM. In backtesting, we can use the 15:00:00 Wap:

class IntradayStrategy(Moonshot):

    ...
    # ask for Wap field (not included by default)
    DB_FIELDS = ["Wap", "Volume", "Close"]
    ...

    def positions_to_gross_returns(self, positions, prices):
        # get the 15:00-15:30 VWAP
        vwaps = prices.loc["Wap"].xs("15:00:00", level="Time")
        # The return is the security's percent change over the day following
        # position entry, multiplied by the position.
        gross_returns = vwaps.pct_change() * positions.shift()
        return gross_returns

Then, for live trading, run the strategy at 15:00:00 and instruct IB to finish the VWAP orders by 15:30:00:

class IntradayStrategy(Moonshot):

    ...
    def order_stubs_to_orders(self, orders, prices):

        # Enter using IB Vwap algo
        orders["OrderType"] = "MKT"
        orders["AlgoStrategy"] = "Vwap"
        # Format timestamp as expected by IB: yyyymmdd hh:mm:ss
        # IB doesn't handle all pytz timezone aliases, so best to convert to UTC/GMT
        now = pd.Timestamp.now("America/New_York")
        end_time = now.replace(hour=15, minute=30, second=0)
        end_time_str = end_time.astimezone("UTC").strftime("%Y%m%d %H:%M:%S GMT")
        orders["AlgoParams_endTime"] = end_time_str
        orders["AlgoParams_allowPastEndTime"] = 1
        orders["Tif"] = "DAY"
        orders["Exchange"] = "SMART"
        return orders

Algo parameters

In the IB API, algorithmic orders are specified by the AlgoStrategy field, with additional algo parameters specified in the AlgoParams field (algo parameters are optional or required depending on the algo). The AlgoParams field is a nested field which expects a list of multiple algo-specific parameters; since the orders CSV (and the DataFrame it derives from) is a flat-file format, these nested parameters can be specified using underscore separators, e.g. AlgoParams_maxPctVol:

def order_stubs_to_orders(self, orders, prices):

    # Enter using IB Vwap algo
    orders["AlgoStrategy"] = "Vwap"
    orders["AlgoParams_maxPctVol"] = 0.1
    orders["AlgoParams_noTakeLiq"] = 1

    ...

Moonshot snippets

These snippets are meant to be useful and suggestive as starting points, but they may require varying degrees of modification to conform to the particulars of your strategy.

Multi-day holding periods

One way to implement multi-day holding periods is to forward-fill signals with a limit:

def signals_to_target_weights(self, signals, prices):

    # allocate 5% of capital to each position
    weights = self.allocate_fixed_weights(signals, 0.05)

    # Hold for 2 additional periods after the signal (3 periods total)
    weights = weights.where(weights!=0).fillna(method="ffill", limit=2)
    weights.fillna(0, inplace=True)

    return weights
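
Interactively, the forward-fill behaves like this (a minimal sketch with hypothetical weights):

>>> import pandas as pd
>>> weights = pd.Series([0.05, 0, 0, 0, 0, 0])
>>> # carry each nonzero weight forward up to 2 additional periods, then restore zeros
>>> weights.where(weights != 0).fillna(method="ffill", limit=2).fillna(0)
0    0.05
1    0.05
2    0.05
3    0.00
4    0.00
5    0.00
dtype: float64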

Limit orders

To use limit orders in a backtest, you can model whether they get filled in target_weights_to_positions. For example, suppose we generate signals after the close and place orders to enter on the open the following day using limit orders set 1% above the prior close for BUYs and 1% below the prior close for SELLs:

def target_weights_to_positions(self, weights, prices):

        # enter the day after the signal
        positions = weights.shift()

        # calculate limit prices
        prior_closes = prices.loc["Close"].shift()
        buy_limit_prices = prior_closes * 1.01
        sell_limit_prices = prior_closes * 0.99

        # see where the stock opened on the day of the position
        opens = prices.loc["Open"]
        buy_orders = positions > 0
        sell_orders = positions < 0
        opens_below_buy_limit = opens < buy_limit_prices
        opens_above_sell_limit = opens > sell_limit_prices

        # zero out positions that don't get filled
        # (Note: For simplicity, this design is suitable for strategies with
        # 1-day holding periods; for multi-day holding periods, additional logic
        # would be needed to distinguish position entry dates and only apply
        # limit price filters based on the position entry dates.)
        gets_filled = (buy_orders & opens_below_buy_limit) | (sell_orders & opens_above_sell_limit)
        positions = positions.where(gets_filled, 0)

        return positions

For live trading, create the corresponding order parameters in order_stubs_to_orders:

def order_stubs_to_orders(self, orders, prices):

    prior_closes = prices.loc["Close"].shift()
    prior_closes = self.reindex_like_orders(prior_closes, orders)

    buy_limit_prices = prior_closes * 1.01
    sell_limit_prices = prior_closes * 0.99

    buy_orders = orders.Action == "BUY"
    sell_orders = ~buy_orders
    orders["LmtPrice"] = None
    orders.loc[buy_orders, "LmtPrice"] = buy_limit_prices.loc[buy_orders]
    orders.loc[sell_orders, "LmtPrice"] = sell_limit_prices.loc[sell_orders]

    ...

GoodAfterTime orders

Place market orders that won't become active until 3:55 PM:

def order_stubs_to_orders(self, orders, prices):

    now = pd.Timestamp.now(self.TIMEZONE)
    good_after_time = now.replace(hour=15, minute=55, second=0)
    # Format timestamp as expected by IB: yyyymmdd hh:mm:ss
    # IB doesn't handle all pytz timezone aliases, so best to convert to UTC/GMT
    good_after_time_str = good_after_time.astimezone("UTC").strftime("%Y%m%d %H:%M:%S GMT")
    orders["GoodAfterTime"] = good_after_time_str
    ...

Early close

For intraday strategies that use the session close bar for rolling calculations, early close days can interfere with the rolling calculations by introducing NaNs. Below, using 15-minute data, we calculate a 50-day moving average, falling back to the early close bar when the regular close bar is missing:

session_closes = prices.loc["Close"].xs("15:45:00", level="Time")

# Fill missing closing prices with early close prices
early_close_session_closes = prices.loc["Close"].xs("12:45:00", level="Time")
session_closes.fillna(early_close_session_closes, inplace=True)

mavgs = session_closes.rolling(window=50).mean()

The scheduling section contains examples of scheduling live trading around early close days.

Zipline

Zipline and pyfolio are open-source libraries for running backtests and analyzing algorithm performance. Both libraries are developed by Quantopian. QuantRocket makes it easy to run Zipline backtests using historical data from QuantRocket's history service and view a pyfolio tear sheet of the results.

Data ingestion

To run a Zipline backtest using data from a QuantRocket history database, the first step is to collect the historical data, and the second step is to "ingest", or import, the historical data into Zipline's native format. Ingested data is referred to as a "data bundle."

Initial ingestion

You can ingest 1-day or 1-minute history databases (the two bar sizes Zipline supports). Let's ingest historical data for AAPL so we can run the Zipline demo strategy.

First, assume we've already collected 1-day bars for AAPL, like so:

$ # get the listing...
$ quantrocket master listings --exchange NASDAQ --symbols AAPL
status: the listing details will be collected asynchronously
$ # monitor flightlog for listing details to be collected, then make a universe:
$ quantrocket master get -e NASDAQ -s AAPL | quantrocket master universe 'just-aapl' -f -
code: just-aapl
inserted: 1
provided: 1
total_after_insert: 1
$ # get 1 day bars for AAPL
$ quantrocket history create-db 'aapl-1d' --universes 'just-aapl' --bar-size '1 day'
status: successfully created quantrocket.history.aapl-1d.sqlite
$ quantrocket history collect 'aapl-1d'
status: the historical data will be collected asynchronously

After the historical data request finishes, we can ingest our historical data into Zipline:

$ quantrocket zipline ingest --history-db 'aapl-1d' --calendar 'NYSE'
msg: successfully ingested aapl-1d bundle
status: success
>>> from quantrocket.zipline import ingest_bundle
>>> ingest_bundle(history_db="aapl-1d", calendar="NYSE")
{'status': 'success', 'msg': 'successfully ingested aapl-1d bundle'}
$ curl -X POST 'http://houston/zipline/bundles?history_db=aapl-1d&calendar=NYSE'
{"status": "success", "msg": "successfully ingested aapl-1d bundle"}
The calendar option is required the first time you ingest data. Calendars are important in Zipline, and choosing a calendar that doesn't align well with your data can lead to confusing error messages. The above data bundle will use Zipline's NYSE calendar. Or you can associate your data bundle with a different Zipline calendar:
$ # to see available Zipline calendars, you can pass an invalid value:
$ quantrocket zipline ingest --history-db 'london-stk-1min' --calendar ?
msg: 'unknown calendar ''?'', choices are: BMF, CFE, CME, ICE, LSE, NYSE, TSX, us_futures'
status: error
$ quantrocket zipline ingest --history-db 'london-stk-1min' --calendar 'LSE'
msg: successfully ingested london-stk-1min bundle
status: success
>>> # to see available Zipline calendars, you can pass an invalid value:
>>> ingest_bundle(history_db="london-stk-1min", calendar="?")
{'status': 'error', 'msg': 'unknown calendar ?, choices are: BMF, CFE, CME, ICE, LSE, NYSE, TSX, us_futures'}
>>> ingest_bundle(history_db="london-stk-1min", calendar="LSE")
{'status': 'success', 'msg': 'successfully ingested london-stk-1min bundle'}
$ curl -X POST 'http://houston/zipline/bundles?history_db=london-stk-1min&calendar=?'
{"status": "error", "msg": "unknown calendar ?, choices are: BMF, CFE, CME, ICE, LSE, NYSE,TSX, us_futures"}
$ curl -X POST 'http://houston/zipline/bundles?history_db=london-stk-1min&calendar=LSE'
{"status": "success", "msg": "successfully ingested london-stk-1min bundle"}
You can optionally ingest a subset of the history database, filtering by date range, universe, or conid. For example, you might import only a single year of a large 1-minute database:
$ quantrocket zipline ingest --history-db 'usa-stk-1min' -s '2017-01-01' -e '2017-12-31' --calendar 'NYSE'
msg: successfully ingested usa-stk-1min bundle
status: success
>>> ingest_bundle(history_db="usa-stk-1min",
                  start_date="2017-01-01", end_date="2017-12-31",
                  calendar="NYSE")
{'status': 'success', 'msg': 'successfully ingested usa-stk-1min bundle'}
$ curl -X POST 'http://houston/zipline/bundles?history_db=usa-stk-1min&start_date=2017-01-01&end_date=2017-12-31&calendar=NYSE'
{"status": "success", "msg": "successfully ingested usa-stk-1min bundle"}
By default the history database code is used as the bundle name, but you can optionally assign a bundle name. Assigning a bundle name allows you to separately ingest multiple subsets of the same database:
$ quantrocket zipline ingest --history-db 'usa-stk-1min' --universes 'nyse-stk' --bundle 'nyse-stk-1min' --calendar 'NYSE'
msg: successfully ingested nyse-stk-1min bundle
status: success
$ quantrocket zipline ingest --history-db 'usa-stk-1min' --universes 'nasdaq-stk' --bundle 'nasdaq-stk-1min' --calendar 'NYSE'
msg: successfully ingested nasdaq-stk-1min bundle
status: success
>>> ingest_bundle(history_db="usa-stk-1min",
                  universes="nyse-stk",
                  bundle="nyse-stk-1min", calendar="NYSE")
{'status': 'success', 'msg': 'successfully ingested nyse-stk-1min bundle'}
>>> ingest_bundle(history_db="usa-stk-1min",
                  universes="nasdaq-stk",
                  bundle="nasdaq-stk-1min", calendar="NYSE")
{'status': 'success', 'msg': 'successfully ingested nasdaq-stk-1min bundle'}
$ curl -X POST 'http://houston/zipline/bundles?history_db=usa-stk-1min&universes=nyse-stk&bundle=nyse-stk-1min&calendar=NYSE'
{"status": "success", "msg": "successfully ingested nyse-stk-1min bundle"}
$ curl -X POST 'http://houston/zipline/bundles?history_db=usa-stk-1min&universes=nasdaq-stk&bundle=nasdaq-stk-1min&calendar=NYSE'
{"status": "success", "msg": "successfully ingested nasdaq-stk-1min bundle"}

Re-ingesting data

After you update your history database with new data, you can re-ingest the database into Zipline using the same API:

$ quantrocket zipline ingest --history-db 'aapl-1d'
msg: successfully ingested aapl-1d bundle
status: success
>>> from quantrocket.zipline import ingest_bundle
>>> ingest_bundle(history_db="aapl-1d")
{'status': 'success', 'msg': 'successfully ingested aapl-1d bundle'}
$ curl -X POST 'http://houston/zipline/bundles?history_db=aapl-1d'
{"status": "success", "msg": "successfully ingested aapl-1d bundle"}
The calendar and any date range or universe filters that you specified during the initial ingestion will be used for the re-ingestion as well. If you need to change the calendar or filters, you must first completely remove the existing bundle:
$ quantrocket zipline clean -b 'aapl-1d' --all
aapl-1d:
- /root/.zipline/data/aapl-1d/2018-10-05T14;07;51.482331
>>> from quantrocket.zipline import clean_bundles
>>> clean_bundles(bundles=["aapl-1d"], clean_all=True)
{'aapl-1d': ['/root/.zipline/data/aapl-1d/2018-10-05T14;07;51.482331']}
$ curl -X DELETE 'http://houston/zipline/bundles?bundles=aapl-1d&clean_all=true'
{"aapl-1d": ["/root/.zipline/data/aapl-1d/2018-10-05T14;07;51.482331"]}
The --all/clean_all option removes all ingestions for the bundle and also deletes the stored bundle configuration. Then, you can ingest the database again with the correct calendar or filters.

Bundle cleanup

Data re-ingestion is not incremental. That is, new data is not appended to earlier data. Rather, the entire database (or the subset based on your filters, if applicable) is ingested each time you run the ingest function. Re-ingested data does not replace the earlier ingested data; rather, Zipline stores each ingestion as a new version of the bundle. By default the most recent ingestion is used when you run a backtest.

You can list your bundles and see the different versions you've ingested:

$ quantrocket zipline bundles
aapl-1d:
- '2018-10-05 14:16:12.246592'
- '2018-10-05 14:07:51.482331'
london-stk-1min:
- '2018-10-05 14:20:11.241632'
>>> from quantrocket.zipline import list_bundles
>>> list_bundles()
{'aapl-1d': ['2018-10-05 14:16:12.246592',
  '2018-10-05 14:07:51.482331'],
 'london-stk-1min': ['2018-10-05 14:20:11.241632']}
$ curl -X GET 'http://houston/zipline/bundles'
{"aapl-1d": ["2018-10-05 14:16:12.246592","2018-10-05 14:07:51.482331"], "london-stk-1min": ["2018-10-05 14:20:11.241632"]}
And you can remove old ingestions:
$ # remove all but most recent ingestion; output shows ingestions removed
$ quantrocket zipline clean -b 'aapl-1d' --keep-last 1
aapl-1d:
- /root/.zipline/data/aapl-1d/2018-10-05T14;07;51.482331
>>> # remove all but most recent ingestion; output shows ingestions removed
>>> from quantrocket.zipline import clean_bundles
>>> clean_bundles(bundles=["aapl-1d"], keep_last=1)
{'aapl-1d': ['/root/.zipline/data/aapl-1d/2018-10-05T14;07;51.482331']}
$ # remove all but most recent ingestion; output shows ingestions removed
$ curl -X DELETE 'http://houston/zipline/bundles?bundles=aapl-1d&keep_last=1'
{"aapl-1d": ["/root/.zipline/data/aapl-1d/2018-10-05T14;07;51.482331"]}

Suppose you update a history database each evening and want to re-ingest it into Zipline each time. To avoid filling up your hard drive with all the bundle ingestions, you might schedule the following commands:

# update 1-min database of NYSE stocks each evening at 5 PM (if the market was open today)
0 17 * * mon-fri quantrocket master isopen 'NYSE' --ago '6H' && quantrocket history collect 'nyse-stk-1min' --priority

# purge existing bundle and re-ingest into Zipline at 11 PM
0 23 * * mon-fri quantrocket zipline clean -b 'nyse-stk-1min' --all && quantrocket zipline ingest --history-db 'nyse-stk-1min' --calendar 'NYSE'

Backtesting

Zipline provides the following demo file of a dual moving average crossover strategy using AAPL:

# dual_moving_average.py

from zipline.api import order_target_percent, record, symbol, set_benchmark

def initialize(context):
    context.sym = symbol('AAPL')
    set_benchmark(symbol('AAPL'))
    context.i = 0

def handle_data(context, data):
    # Skip first 300 days to get full windows
    context.i += 1
    if context.i < 300:
        return

    # Compute averages
    # history() has to be called with the same params
    # from above and returns a pandas dataframe.
    short_mavg = data.history(context.sym, 'price', 100, '1d').mean()
    long_mavg = data.history(context.sym, 'price', 300, '1d').mean()

    # Trading logic
    if short_mavg > long_mavg:
        # order_target_percent orders as many shares as needed to
        # achieve the desired percent allocation.
        order_target_percent(context.sym, 0.2)
    elif short_mavg < long_mavg:
        order_target_percent(context.sym, 0)

    # Save values for later inspection
    record(AAPL=data.current(context.sym, "price"),
           short_mavg=short_mavg,
           long_mavg=long_mavg)

Place this file in the 'zipline' subdirectory inside Jupyter.

Next, run the backtest from a notebook, specifying the bundle you ingested earlier, and save the backtest results to a CSV:

from quantrocket.zipline import run_algorithm
run_algorithm("dual_moving_average.py", data_frequency="daily",
              bundle="aapl-1d",
              start="2000-01-01", end="2017-01-01",
              filepath_or_buffer="aapl_results.csv")

You can plot the backtest results using pyfolio:

import pyfolio as pf
pf.from_zipline_csv("aapl_results.csv")

You can also load the backtest results into a DataFrame:

>>> from quantrocket.zipline import ZiplineBacktestResult
>>> result = ZiplineBacktestResult.from_csv("aapl_results.csv")
>>> result.perf.iloc[-1]
column
algorithm_period_return                                             0.140275
benchmark_period_return                                               2.6665
capital_used                                                        -21976.8
ending_cash                                                      9.15718e+06
ending_exposure                                                  2.24557e+06
ending_value                                                     2.24557e+06
excess_return                                                              0
gross_leverage                                                      0.196932
long_exposure                                                    2.24557e+06
long_value                                                       2.24557e+06
longs_count                                                                1
max_drawdown                                                      -0.0993747
max_leverage                                                        0.214012
net_leverage                                                        0.196932
orders                     [{'filled': 199, 'limit': None, 'commission': ...}]
period_close                                       2014-12-31 21:00:00+00:00
period_label                                                         2014-12
period_open                                        2014-12-31 14:31:00+00:00
pnl                                                                 -43121.5
portfolio_value                                                  1.14028e+07
positions                  [{'last_sale_price': 110.38, 'sid': Equity(265...)}]
returns                                                          -0.00376743
...

Backtesting via CLI

You can also use the command line to run a backtest and generate a PDF tear sheet:

$ quantrocket zipline run --bundle 'aapl-1d' -f 'dual_moving_average.py' -s '2000-01-01' -e '2017-01-01' -o aapl_results.csv
$ quantrocket zipline tearsheet aapl_results.csv -o aapl_results.pdf

Open the PDF and have a look:

zipline pyfolio tearsheet

Fundamental data in Zipline

QuantRocket provides access to the Reuters Worldwide Fundamentals dataset via Zipline's Pipeline API. First collect the data into your QuantRocket database as described in the fundamental data section of the usage guide.

To use the fundamental data in Pipeline, import the ReutersFinancials Pipeline dataset (for annual financial reports) or the ReutersInterimFinancials dataset (for interim/quarterly financial reports) from the zipline_extensions package provided by QuantRocket. You can reference any of the available financial statement indicator codes and use them to build a custom Pipeline factor. (See the fundamental data section of the usage guide for help looking up the codes.)

Below, we create a custom Pipeline factor that calculates price-to-book ratio.

from zipline.pipeline import Pipeline, CustomFactor
from zipline.pipeline.data import USEquityPricing
# zipline_extensions is provided by QuantRocket
from zipline_extensions.pipeline.data import ReutersFinancials # or ReutersInterimFinancials

# Create a price-to-book custom pipeline factor
class PriceBookRatio(CustomFactor):
    """
    Custom factor that calculates price-to-book ratio.

    First, calculate book value per share, defined as:

        (Total Assets - Total Liabilities) / Number of shares outstanding

    The codes we'll use for these metrics are 'ATOT' (Total Assets),
    'LTLL' (Total Liabilities), and 'QTCO' (Total Common Shares Outstanding).

    Price-to-book ratio is then calculated as:

        closing price / book value per share
    """
    inputs = [
        USEquityPricing.close, # despite the name, this works fine for non-US equities too
        ReutersFinancials.ATOT, # total assets
        ReutersFinancials.LTLL, # total liabilities
        ReutersFinancials.QTCO # common shares outstanding
    ]
    window_length = 1

    def compute(self, today, assets, out, closes, tot_assets, tot_liabilities, shares_out):
        book_values_per_share = (tot_assets - tot_liabilities)/shares_out
        pb_ratios = closes/book_values_per_share
        out[:] = pb_ratios

Now we can use our custom factor in our Pipeline:

pipe = Pipeline()
pb_ratios = PriceBookRatio()
pipe.add(pb_ratios, 'pb_ratio')
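
To use the factor in a Zipline algorithm, attach the Pipeline in initialize and read its output before trading starts. Below is a minimal sketch (the pipeline name 'pb_pipeline' is arbitrary):

from zipline.api import attach_pipeline, pipeline_output
from zipline.pipeline import Pipeline

def initialize(context):
    pipe = Pipeline()
    pipe.add(PriceBookRatio(), 'pb_ratio')
    attach_pipeline(pipe, 'pb_pipeline')

def before_trading_start(context, data):
    # DataFrame of pb_ratio values, indexed by asset
    context.factor_data = pipeline_output('pb_pipeline')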

A demo strategy utilizing financial statements is available in the codeload-demo repository.

See Zipline's documentation for more on using the Pipeline API.

Other Backtesters

QuantRocket makes it easy to integrate other Python backtesters. You can tell QuantRocket what packages to install, you can use QuantRocket's Python client to pull historical and fundamental data into your strategy code, and you can use the CLI to run your backtests. You get the benefit of QuantRocket's infrastructure and data services together with the freedom and flexibility to choose the backtester best suited to your particular strategy.

As an example, we'll show how to connect the open-source Python backtesting framework backtrader to QuantRocket.

Define your satellite service

First, we add a satellite service to our Docker Compose file and tell QuantRocket what packages to install on it. The service name should consist of alphanumerics and hyphens, and should begin with 'satellite'. We'll name our backtrader service 'satellite':

# docker-compose.yml
services:
    ...
    satellite:
        image: 'quantrocket/satellite:latest'
        volumes_from:
            - codeload
        environment:
            PIP_INSTALL: 'backtrader>=1.9'

The satellite service is QuantRocket's extensible service for bringing outside packages and tools into the QuantRocket solar system. The quantrocket/satellite Docker image ships with Anaconda, Python 3, and the QuantRocket client. We can instruct the service to install additional Python packages by specifying an environment variable called PIP_INSTALL which should contain a space-separated string of Python packages. If needed, we can also install Debian packages by specifying an APT_INSTALL environment variable, but we don't need this for our example.

Now that we've defined our service, we can launch our service using Docker Compose:

$ docker-compose -f path/to/docker-compose.yml -p quantrocket up -d satellite

Write your strategy code

Let's write a basic moving average strategy for backtrader using AAPL stock. First, assume we've already collected 1-day bars for AAPL, like so:

$ # get the listing...
$ quantrocket master listings --exchange NASDAQ --symbols AAPL
status: the listing details will be collected asynchronously
$ # monitor flightlog for listing details to be collected, then make a universe:
$ quantrocket master get -e NASDAQ -s AAPL | quantrocket master universe 'just-aapl' -f -
code: just-aapl
inserted: 1
provided: 1
total_after_insert: 1
$ # get 1 day bars for AAPL
$ quantrocket history create-db 'aapl-1d' --universes 'just-aapl' --bar-size '1 day'
status: successfully created quantrocket.history.aapl-1d.sqlite
$ quantrocket history collect 'aapl-1d'
status: the historical data will be collected asynchronously

Now that we have historical data for AAPL, we can use it in backtrader by downloading a CSV and creating our backtrader data feed from it. The relevant snippet is shown below:

import backtrader.feeds as btfeeds
from quantrocket.history import download_history_file

# Create data feed using QuantRocket data and add to backtrader
# (Put files in /tmp to have QuantRocket automatically clean them out after
# a few hours)
download_history_file(
    'aapl-1d',
    filepath_or_buffer='/tmp/aapl-1d.csv',
    fields=['ConId','Date','Open','Close','High','Low','Volume'])

data = btfeeds.GenericCSVData(
    dataname='/tmp/aapl-1d.csv',
    dtformat=('%Y-%m-%d'),
    datetime=1,
    open=2,
    close=3,
    high=4,
    low=5,
    volume=6,
    openinterest=-1  # no open interest column in the CSV
)
cerebro.adddata(data)

A backtest commonly ends by plotting a performance chart, but since our code will be running in a headless Docker container, we should save the plot to a file (which we'll tell QuantRocket to return to us when we run the backtest):

# Save the plot to PDF so the satellite service can return it (make sure
# to use the Agg backend)
cerebro.plot(use='Agg', savefig=True, figfilename='/tmp/backtrader-plot.pdf')

A complete, working strategy is shown below:

# dual_moving_average.py

import backtrader as bt
import backtrader.feeds as btfeeds
from quantrocket.history import download_history_file

class DualMovingAverageStrategy(bt.SignalStrategy):

    params = (
        ('smavg_window', 100),
        ('lmavg_window', 300),
    )

    def __init__(self):

        # Compute long and short moving averages
        smavg = bt.ind.SMA(period=self.p.smavg_window)
        lmavg = bt.ind.SMA(period=self.p.lmavg_window)

        # Go long when short moving average is above long moving average
        self.signal_add(bt.SIGNAL_LONG, bt.ind.CrossOver(smavg, lmavg))

if __name__ == '__main__':

    cerebro = bt.Cerebro()

    # Create data feed using QuantRocket data and add to backtrader
    # (Put files in /tmp to have QuantRocket automatically clean them out after
    # a few hours)
    download_history_file(
        'aapl-1d',
        filepath_or_buffer='/tmp/aapl-1d.csv',
        fields=['ConId','Date','Open','Close','High','Low','Volume'])

    data = btfeeds.GenericCSVData(
        dataname='/tmp/aapl-1d.csv',
        dtformat=('%Y-%m-%d'),
        datetime=1,
        open=2,
        close=3,
        high=4,
        low=5,
        volume=6,
        openinterest=-1  # no open interest column in the CSV
    )
    cerebro.adddata(data)

    cerebro.addstrategy(DualMovingAverageStrategy)
    cerebro.run()

    # Save the plot to PDF so the satellite service can return it (make sure
    # to use the Agg backend)
    cerebro.plot(use='Agg', savefig=True, figfilename='/tmp/backtrader-plot.pdf')

Place this file in your codeload volume, which we mounted inside the satellite service above. Inside the satellite service, the codeload volume will be mounted at /codeload. Reminder: for local deployments, you probably mapped the codeload service to a directory on your host machine containing your code and config files; you'll place the algo file in this directory. For cloud deployments, you probably told codeload to pull your code and config from a Git repo; you'll place the algo file in your Git repo.

Run your backtests

We can now run our backtest from the QuantRocket client. The API for the satellite service lets us execute an arbitrary command and optionally return a file. In our case, we'll execute our algo script and tell QuantRocket to return the PDF performance chart that our script will create.

$ # We've placed our dual_moving_average.py script in a 'backtrader' folder in our
$ # codeload volume, and the codeload volume is mounted inside the Docker
$ # container at /codeload, so the path to our script inside the container is
$ # /codeload/backtrader/dual_moving_average.py
$ quantrocket satellite exec 'python /codeload/backtrader/dual_moving_average.py' --return-file '/tmp/backtrader-plot.pdf' --outfile 'backtrader-plot.pdf'
$ # now we can have a look at backtrader-plot.pdf
>>> from quantrocket.satellite import execute_command
>>> # We've placed our dual_moving_average.py script in a 'backtrader' folder in our
>>> # codeload volume, and the codeload volume is mounted inside the Docker
>>> # container at /codeload, so the path to our script inside the container is
>>> # /codeload/backtrader/dual_moving_average.py
>>> execute_command("python /codeload/backtrader/dual_moving_average.py",
                    return_file="/tmp/backtrader-plot.pdf",
                    filepath_or_buffer="backtrader-plot.pdf")
>>> # now we can have a look at backtrader-plot.pdf
$ # We've placed our dual_moving_average.py script in a 'backtrader' folder in our
$ # codeload volume, and the codeload volume is mounted inside the Docker
$ # container at /codeload, so the path to our script inside the container is
$ # /codeload/backtrader/dual_moving_average.py
$ curl -X POST 'http://houston/satellite/commands?cmd=python%20%2Fcodeload%2Fbacktrader%2Fdual_moving_average.py&return_file=%2Ftmp%2Fbacktrader-plot.pdf' > backtrader-plot.pdf
$ # now we can have a look at backtrader-plot.pdf

Scheduling

You can use QuantRocket's cron service, named "countdown," to schedule automated tasks such as collecting historical data or running your trading strategies.

You can pick the timezone in which you want to schedule your tasks, and you can create as many countdown services as you like. If you plan to trade in multiple timezones, consider creating a separate countdown service for each timezone where you will trade.

When scheduling cron jobs, it's easiest to schedule the jobs in the timezone of the exchange they relate to. For example, if you want to download stock loan data for Australian stocks every day at 9:45 AM before the market opens at 10:00 AM local time, it's better to schedule this in Sydney time than in, say, New York time. Scheduling in New York time would require you to adjust the crontab times several times per year whenever there is a daylight saving time change in New York or Sydney. By scheduling the cron job in Sydney time, you never have to worry about this. If you also have other cron jobs that need to be anchored to another timezone, run a separate countdown service for those jobs.

Add a countdown service

To add a countdown service to an existing deployment, use the configuration wizard to define the name and timezone of your countdown service:

Countdown configuration wizard

Copy the block of YAML for the countdown service from the configuration wizard and paste it at the bottom of your Docker Compose or Stack file. As an example, if you define a countdown service running in New York time, the block of YAML might look like this:

countdown-newyork:
  image: 'quantrocket/countdown:1.1.0'
  environment:
    SERVICE_NAME: countdown-newyork
    TZ: America/New_York
  volumes_from:
    - codeload

You can then deploy the new service. For local deployments:

$ docker-compose -f path/to/docker-compose.yml -p quantrocket up -d

Create your crontab

You can create and edit your crontab within the Jupyter environment. The countdown service uses a naming convention to recognize and load the correct crontab. In the above example of a countdown service named countdown-newyork, the service will look for and load a crontab named quantrocket.countdown-newyork.crontab. The expected filename is displayed in the configuration wizard when you first define the service. This file should be created in the top-level of your codeload volume, that is, in the top level of your Jupyter file browser.

create crontab

After you create the file, you can add cron jobs as on a standard crontab. An example crontab is shown below:

# Crontab syntax cheat sheet
# .------------ minute (0 - 59)
# |   .---------- hour (0 - 23)
# |   |   .-------- day of month (1 - 31)
# |   |   |   .------ month (1 - 12) OR jan,feb,mar,apr ...
# |   |   |   |   .---- day of week (0 - 6) (Sunday=0 or 7)  OR sun,mon,tue,wed,thu,fri,sat
# |   |   |   |   |
# *   *   *   *   *   command to be executed

# Collect historical data Monday-Friday evenings at 5:30 PM
30 17 * * 1-5 quantrocket history collect 'nasdaq-1d'
# Collect fundamental data on Sunday afternoons
0 14 * * 7 quantrocket fundamental collect-financials -u 'nasdaq'

Each time you edit the crontab, the corresponding countdown service will detect the change and reload the file.

Validate your crontab

Whenever you save your crontab, it's a good idea to have flightlog open (quantrocket flightlog stream) so you can check that it was successfully loaded by the countdown service:

2018-02-21 09:31:57 quantrocket.countdown-newyork: INFO Successfully loaded quantrocket.countdown-newyork.crontab

If there are syntax errors in the file, it will be rejected (a common error is failing to include an empty line at the bottom of the crontab):

2018-02-21 09:32:38 quantrocket.countdown-newyork: ERROR quantrocket.countdown-newyork.crontab is invalid, please correct the errors:
2018-02-21 09:32:38 quantrocket.countdown-newyork: ERROR     new crontab file is missing newline before EOF, cannot install.
2018-02-21 09:32:38 quantrocket.countdown-newyork: ERROR

You can also use the client to print out the crontab installed in your container so you can verify that it is as expected:

$ quantrocket countdown crontab countdown-newyork
>>> from quantrocket.countdown import get_crontab
>>> get_crontab("countdown-newyork")
$ curl -X GET 'http://houston/countdown-newyork/crontab'

Monitor cron errors

Assuming your crontab is free of syntax errors and loaded successfully, there might still be errors when your commands run and you will want to know about those. You can monitor flightlog for this purpose, as any errors returned by the unattended commands will be logged to flightlog. Setting up flightlog's Papertrail integration works well for this purpose as it allows you to monitor anywhere and set up alerts.

Generally, errors will be logged to flightlog's application (non-detailed) logs. The exception is that if you misspell "quantrocket" or call a program that doesn't exist, the error message will only show up in flightlog's detailed logs:

$ quantrocket flightlog get --detailed /tmp/system.log
$ tail /tmp/system.log
quantrocket_countdown-newyork_1|Date: Tue, 24 Apr 2018 13:04:01 -0400
quantrocket_countdown-newyork_1|
quantrocket_countdown-newyork_1|/bin/sh: 1: quantrockettttt: not found
quantrocket_countdown-newyork_1|

In addition to error output, flightlog's detailed logs will log all output from your cron jobs. The output will be formatted as text emails because this is the format the cron utility uses.

Trading calendars

Collect trading calendars

You can collect upcoming trading hours for the exchanges you trade and use them in your scheduling. First, make sure you've already collected listings for the exchange(s) you care about:

$ quantrocket master listings --exchange 'NYSE' --sec-types 'STK'
status: the listing details will be collected asynchronously
>>> from quantrocket.master import collect_listings
>>> collect_listings(exchange="NYSE", sec_types=["STK"])
{'status': 'the listing details will be collected asynchronously'}
$ curl -X POST 'http://houston/master/listings?exchange=NYSE&sec_types=STK'
{"status": "the listing details will be collected asynchronously"}
Once the listings are saved to your database, you're ready to collect the exchange hours:
$ quantrocket master collect-calendar
status: the trading hours will be collected asynchronously
>>> from quantrocket.master import collect_calendar
>>> collect_calendar()
{'status': 'the trading hours will be collected asynchronously'}
$ curl -X POST 'http://houston/master/calendar'
{"status": "the trading hours will be collected asynchronously"}
This will collect trading hours for all exchanges in your securities master database. Optionally, you can limit by exchange:
$ quantrocket master collect-calendar -e 'NYSE'
status: the trading hours will be collected asynchronously
>>> from quantrocket.master import collect_calendar
>>> collect_calendar(exchanges=["NYSE"])
{'status': 'the trading hours will be collected asynchronously'}
$ curl -X POST 'http://houston/master/calendar?exchanges=NYSE'
{"status": "the trading hours will be collected asynchronously"}

Trading hours for the next month are returned by the IB API; this means you need to re-run the command periodically. You can add it to one of your countdown service crontabs:

# Collect upcoming trading hours weekdays at 3 AM
0 3 * * mon-fri quantrocket master collect-calendar
The IB API provides trading hours by security, but for simplicity QuantRocket stores trading hours by exchange. QuantRocket selects a sampling of securities for each exchange and requests trading hours for those securities.

Query trading hours

Once you've collected trading hours for an exchange, you can query to see if the exchange is open or closed. You'll get the status (open or closed) as well as when the status took effect and when it will next change:

$ quantrocket master calendar 'NYSE'
NYSE:
  since: '2018-05-10T09:30:00'
  status: open
  timezone: America/New_York
  until: '2018-05-10T16:00:00'
>>> from quantrocket.master import list_calendar_statuses
>>> list_calendar_statuses(["NYSE"])
{'NYSE': {'since': '2018-05-10T09:30:00',
  'status': 'open',
  'timezone': 'America/New_York',
  'until': '2018-05-10T16:00:00'}}
$ curl 'http://houston/master/calendar?exchanges=NYSE'
{"NYSE": {"status": "open", "since": "2018-05-10T09:30:00", "until": "2018-05-10T16:00:00", "timezone": "America/New_York"}}
By default the exchange's current status is returned, but you can also check what the exchange status was in the past (using a Pandas timedelta string):
$ quantrocket master calendar 'NYSE' --ago '12h'
NYSE:
  since: '2018-05-09T16:00:00'
  status: closed
  timezone: America/New_York
  until: '2018-05-10T09:30:00'
>>> from quantrocket.master import list_calendar_statuses
>>> list_calendar_statuses(["NYSE"], ago="12h")
{'NYSE': {'since': '2018-05-09T16:00:00',
  'status': 'closed',
  'timezone': 'America/New_York',
  'until': '2018-05-10T09:30:00'}}
$ curl 'http://houston/master/calendar?exchanges=NYSE&ago=12h'
{"NYSE": {"status": "closed", "since": "2018-05-09T16:00:00", "until": "2018-05-10T09:30:00", "timezone": "America/New_York"}}
Or what the exchange status will be in the future:
$ quantrocket master calendar 'NYSE' --in '30min'
NYSE:
  since: '2018-05-10T16:00:00'
  status: closed
  timezone: America/New_York
  until: '2018-05-11T09:30:00'
>>> from quantrocket.master import list_calendar_statuses
>>> list_calendar_statuses(["NYSE"], in_="30min")
{'NYSE': {'since': '2018-05-10T16:00:00',
  'status': 'closed',
  'timezone': 'America/New_York',
  'until': '2018-05-11T09:30:00'}}
$ curl 'http://houston/master/calendar?exchanges=NYSE&in=30min'
{"NYSE": {"status": "closed", "since": "2018-05-10T16:00:00", "until": "2018-05-11T09:30:00", "timezone": "America/New_York"}}

Conditional scheduling with isopen / isclosed

The most common use of trading calendars in QuantRocket is to conditionally schedule commands that run on the countdown service. Conditional scheduling is accomplished using quantrocket master isopen and quantrocket master isclosed. For example, we could schedule a NASDAQ history database to be updated only if the NASDAQ was open today:

# Update history db at 5:30 PM if market was open today
30 17 * * mon-fri quantrocket master isopen 'NASDAQ' --ago '5h' && quantrocket history collect 'nasdaq-eod'

quantrocket master isopen and quantrocket master isclosed are used as true/false assertions: they don't print any output but return an exit code of 0 (indicating success) if the condition is met and an exit code of 1 (indicating failure) if it is not met. In shell, a double-ampersand (&&) between commands indicates that the second command will only run if the preceding command returns a 0 exit code. Thus, in the above example, if the NASDAQ was open 5 hours ago, the historical data command will run; if the NASDAQ wasn't open, it won't.

The --in and --ago options allow you to check the exchange status in the past or future; if omitted, the command checks the current exchange status. The --in/--ago options accept any string that can be passed to pd.Timedelta.

To get the feel of using isopen/isclosed, you can open a terminal and try the commands in conjunction with echo:

$ # if the exchange assertion is true, you'll see the printed output, otherwise not
$ quantrocket master isopen 'GLOBEX' --in '1h' && echo "assertion passed"

Generally, live trading commands should always be prefixed with an appropriate isopen/isclosed:

# Run strategy at 9:00 AM if market will be open today
0 9 * * mon-fri quantrocket master isopen 'NASDAQ' --in '1h' && quantrocket moonshot trade 'my-strategy' | quantrocket blotter order -f '-'

# Run intraday strategy at 10:30 AM if market is open
30 10 * * mon-fri quantrocket master isopen 'NASDAQ' && quantrocket moonshot trade 'my-intraday-strategy' | quantrocket blotter order -f '-'

You can chain together multiple isopen/isclosed for more complex conditions. The following example shows how to run a strategy at 12:45 PM on early close days and at 3:45 PM on regular days:

# Trade at 12:45 PM on early close days
45 12 * * mon-fri quantrocket master isopen 'ARCA' && quantrocket master isclosed 'ARCA' --in '1h' && quantrocket moonshot trade 'my-etf-strategy' | quantrocket blotter order -f '-'
# Trade at 3:45 PM on regular trading days
45 15 * * mon-fri quantrocket master isopen 'ARCA' && quantrocket moonshot trade 'my-etf-strategy' | quantrocket blotter order -f '-'

Using the --since and --until options, you can schedule commands to run only at the beginning (or end) of the month, quarter, etc. This can be useful for strategies that periodically rebalance:

# Rebalance before the Tokyo open on the first trading day of the quarter
30 8 * * mon-fri quantrocket master isopen 'TSEJ' --in '1h' && quantrocket master isclosed 'TSEJ' --since 'Q' && quantrocket moonshot trade 'monthly-strategy' | quantrocket blotter order -f '-'

# Trade a window dressing strategy at 3:45 PM on the last trading day of the month
45 15 * * mon-fri quantrocket master isopen 'NYSE' && quantrocket master isclosed 'NYSE' --in '1h' --until 'M' && quantrocket moonshot trade 'window-dressing' | quantrocket blotter order -f '-'

# Trade a strategy before the open on the first trading day of the week (if Monday
# is a holiday, the strategy will run on Tuesday, for example)
0 9 * * mon-fri quantrocket master isclosed 'NYSE' --since 'W' && quantrocket master isopen 'NYSE' --in 1h && quantrocket moonshot trade 'umd-us' | quantrocket blotter order -f -

The --since/--until options are applied after --in/--ago, if both are specified. For example, quantrocket master isclosed 'NYSE' --in '1h' --until 'M' asserts that the NYSE will be closed in 1 hour and will remain closed through month end. The --since/--until options accept a Pandas offset alias or anchored offset, or more broadly any string that can be passed as the freq argument to pd.date_range.
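
If you're unsure whether a particular string is valid, you can test it directly in pandas. The sketch below simply illustrates the formats each option accepts; it isn't part of the QuantRocket API:

import pandas as pd

# --in/--ago accept anything that can be passed to pd.Timedelta
pd.Timedelta("30min")   # 30 minutes
pd.Timedelta("12h")     # 12 hours

# --since/--until accept offset aliases or anchored offsets, i.e. anything
# that can be passed as the freq argument to pd.date_range
pd.date_range("2018-01-01", periods=3, freq="M")      # month end
pd.date_range("2018-01-01", periods=3, freq="Q")      # quarter end
pd.date_range("2018-01-01", periods=3, freq="W-MON")  # weekly, anchored on Mondays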

Account Monitoring

QuantRocket keeps track of your IB account balances and of exchange rates between your IB base currency and other currencies you might trade. You can also check your IB portfolio in real-time.

IB account balances

You can query your latest account balance through QuantRocket without having to open Trader Workstation. IB provides many account-related fields, so you might want to limit which fields are returned. This will check your Net Liquidation Value (IB's term for your account balance):

$ quantrocket account balance --latest --fields 'NetLiquidation' | csvlook
| Account   | Currency | NetLiquidation |         LastUpdated |
| --------- | -------- | -------------- | ------------------- |
| DU12345   | USD      |     500,000.00 | 2018-02-02 22:57:13 |
>>> from quantrocket.account import download_account_balances
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> download_account_balances(f, latest=True, fields=["NetLiquidation"])
>>> balances = pd.read_csv(f, parse_dates=["LastUpdated"])
>>> balances.head()
   Account Currency  NetLiquidation         LastUpdated
0  DU12345      USD        500000.0 2018-02-02 22:57:13
$ curl 'http://houston/account/balances.csv?latest=true&fields=NetLiquidation'
Account,Currency,NetLiquidation,LastUpdated
DU12345,USD,500000.0,2018-02-02 22:57:13

Using the CLI, you can filter the output to show only accounts where the margin cushion is below 5%, and log the results (if any) to flightlog:

$ quantrocket account balance --latest --below 'Cushion:0.05' --fields 'NetLiquidation' 'Cushion' | quantrocket flightlog log --name 'quantrocket.account' --level 'CRITICAL'

If you've set up Twilio alerts for CRITICAL messages, you can add this command to the crontab on one of your countdown services, and you'll get a text message whenever you're at risk of auto-liquidation by IB. If no accounts are below the cushion, nothing will be logged.
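
For example, a countdown crontab entry along the following lines would check the margin cushion every 30 minutes during US trading hours (the schedule shown here is illustrative only):

# Check margin cushion every 30 minutes on weekdays; log to flightlog if below 5%
*/30 9-16 * * mon-fri quantrocket account balance --latest --below 'Cushion:0.05' --fields 'NetLiquidation' 'Cushion' | quantrocket flightlog log --name 'quantrocket.account' --level 'CRITICAL'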

Account balance history

Whenever you're connected to IB, QuantRocket pings IB every few minutes and saves your latest account balance details to your database. One reading per day (if available) is retained permanently to provide a historical record of your account balances over time. This is used by the blotter for performance tracking. You can download a CSV of your available account balance history:

$ quantrocket account balance --outfile balances.csv
>>> from quantrocket.account import download_account_balances
>>> download_account_balances("balances.csv")
>>> balances = pd.read_csv("balances.csv")
$ curl 'http://houston/account/balances.csv' > balances.csv
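
From there you can load the history into pandas for further analysis. The following is a minimal sketch that charts net liquidation value over time, assuming matplotlib is installed and a single account/base currency (for multiple accounts, group by the Account column first):

import pandas as pd
from quantrocket.account import download_account_balances

# download the full balance history and chart net liquidation value over time
download_account_balances("balances.csv")
balances = pd.read_csv("balances.csv", parse_dates=["LastUpdated"])
balances.set_index("LastUpdated")["NetLiquidation"].plot(title="Net liquidation value")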

IB portfolio

You can check your current IB portfolio without logging into Trader Workstation:

$ quantrocket account portfolio | csvlook -I
| Account  | ConId     | Description              | Position   | UnrealizedPnl | RealizedPnl | MarketPrice  | ...
| -------- | --------- | ------------------------ | ---------- | ------------- | ----------- | ------------ |
| DU123456 | 255253337 | MXP FUT @GLOBEX 20180618 | -1.0       | 1173.72       | 0.0         | 0.0504276    |
| DU123456 | 35045199  | USD.MXN CASH @IDEALPRO   | -24402.0   | 11960.16      | 0.0         | 19.7354698   |
| DU123456 | 185291219 | WALMEX STK @MEXI         | 165.0      | 796.8         | 0.0         | 48.92274855  |
| DU123456 | 253190540 | EWI STK @ARCA            | 109.0      | -2.03         | 0.0         | 32.38597105  |
>>> from quantrocket.account import download_account_portfolio
>>> import io
>>> f = io.StringIO()
>>> download_account_portfolio(f)
>>> portfolio = pd.read_csv(f, parse_dates=["LastUpdated"])
>>> portfolio.head()
     Account      ConId               Description  Position  UnrealizedPnl  RealizedPnl  MarketPrice  ...
0  DU123456  255253337  MXP FUT @GLOBEX 20180618      -1.0         1173.72          0.0     0.050428
1  DU123456   35045199    USD.MXN CASH @IDEALPRO  -24402.0        12368.15          0.0    19.718750
2  DU123456  185291219          WALMEX STK @MEXI     165.0          796.80          0.0    48.922749
3  DU123456  253190540             EWI STK @ARCA     109.0           -2.03          0.0    32.385971
$ curl -X GET 'http://houston/account/portfolio.csv' | csvlook -I
| Account  | ConId     | Description              | Position   | UnrealizedPnl | RealizedPnl | MarketPrice  | ...
| -------- | --------- | ------------------------ | ---------- | ------------- | ----------- | ------------ |
| DU123456 | 255253337 | MXP FUT @GLOBEX 20180618 | -1.0       | 1173.72       | 0.0         | 0.0504276    |
| DU123456 | 35045199  | USD.MXN CASH @IDEALPRO   | -24402.0   | 11960.16      | 0.0         | 19.7354698   |
| DU123456 | 185291219 | WALMEX STK @MEXI         | 165.0      | 796.8         | 0.0         | 48.92274855  |
| DU123456 | 253190540 | EWI STK @ARCA            | 109.0      | -2.03         | 0.0         | 32.38597105  |

The portfolio is a basic snapshot of what is visible in TWS. Checking your portfolio requires IB Gateway to be connected and is mainly intended to be used when you can't log in to Trader Workstation because your login is being used by IB Gateway. Only the current portfolio is available; historical performance tracking is provided separately by QuantRocket's blotter.

Exchange rates

To support currency conversions between your base currency and other currencies you might trade, QuantRocket collects daily exchange rates and stores them in your database. Exchange rates come from the European Central Bank, which updates them each business day at 4 PM CET.

You probably won't need to query the exchange rates directly very often, but you can if needed. You can check the latest exchange rates:

$ quantrocket account rates --latest | csvlook -I
| BaseCurrency | QuoteCurrency | Rate    | Date       |
| ------------ | ------------- | ------- | ---------- |
| USD          | AUD           | 1.2774  | 2018-01-09 |
| USD          | CAD           | 1.2425  | 2018-01-09 |
| USD          | CHF           | 0.98282 | 2018-01-09 |
...
>>> from quantrocket.account import download_exchange_rates
>>> import io
>>> import pandas as pd
>>> f = io.StringIO()
>>> download_exchange_rates(f, latest=True)
>>> rates = pd.read_csv(f, parse_dates=["Date"])
>>> rates.head()
  BaseCurrency QuoteCurrency      Rate       Date
0          USD           AUD   1.2774  2018-01-09
1          USD           CAD   1.2425  2018-01-09
2          USD           CHF   0.98282 2018-01-09
...
$ curl 'http://houston/account/rates.csv?latest=true'
BaseCurrency,QuoteCurrency,Rate,Date
USD,AUD,1.2774,2018-01-09
USD,CAD,1.2425,2018-01-09
USD,CHF,0.98282,2018-01-09
...
Or download a CSV of all exchange rates stored in your database:
$ quantrocket account rates --outfile rates.csv
>>> from quantrocket.account import download_exchange_rates
>>> download_exchange_rates("rates.csv")
>>> rates = pd.read_csv("rates.csv")
$ curl 'http://houston/account/rates.csv' > rates.csv
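
As a simple illustration of how the rates can be used, the sketch below converts an amount quoted in a foreign currency into your base currency using the latest rates (the CAD amount is hypothetical, and the example assumes CAD is not your base currency):

import io
import pandas as pd
from quantrocket.account import download_exchange_rates

# get the latest rates; each rate is the value of 1 unit of the base currency
# expressed in the quote currency
f = io.StringIO()
download_exchange_rates(f, latest=True)
rates = pd.read_csv(f, parse_dates=["Date"])

# convert a hypothetical CAD 10,000 value into the base currency
base_to_cad = rates.loc[rates.QuoteCurrency == "CAD", "Rate"].iloc[0]
cad_value = 10000
base_value = cad_value / base_to_cad
print(round(base_value, 2))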

Orders and Positions

You can use QuantRocket's blotter service to place, monitor, and cancel orders, track open positions, and record and analyze live trading performance.

In trading terminology, a "blotter" is a detailed log or record of orders and executions. In QuantRocket the blotter is not only used for tracking orders but for placing orders as well.

Place orders

You can place orders from a CSV or JSON file, or directly from the CLI or Python client. A CSV of orders should have one order per row:

$ # Orders for AAPL (ConId 265598) and AMZN (ConId 3691937) stock
$ csvlook -I orders.csv
| ConId   | Account  | Action | OrderRef | TotalQuantity | Exchange | OrderType | Tif |
| ------- | -------- | ------ | -------- | ------------- | -------- | --------- | --- |
| 265598  | DU123456 | BUY    | dma-tech | 500           | SMART    | MKT       | DAY |
| 3691937 | DU123456 | BUY    | dma-tech | 50            | SMART    | MKT       | DAY |

For live trading, Moonshot produces a CSV of orders similar to the above example. A JSON file of orders can also be used and should consist of an array of orders:

$ # Orders for AAPL (ConId 265598) and AMZN (ConId 3691937) stock
$ cat orders.json
[
    {
        "ConId": 265598,
        "Account": "DU123456",
        "Action": "BUY",
        "OrderRef": "dma-tech",
        "TotalQuantity": 500,
        "Exchange": "SMART",
        "OrderType": "MKT",
        "Tif": "DAY"
    },
    {
        "ConId": 3691937,
        "Account": "DU123456",
        "Action": "BUY",
        "OrderRef": "dma-tech",
        "TotalQuantity": 50,
        "Exchange": "SMART",
        "OrderType": "MKT",
        "Tif": "DAY"
    }
]

Use the blotter to place the orders in the file. The order IDs will be returned:

$ quantrocket blotter order -f orders.csv # or orders.json
6001:25
6001:26
>>> from quantrocket.blotter import place_orders
>>> place_orders(infilepath_or_buffer="orders.csv") # or orders.json
['6001:25', '6001:26']
$ curl -X POST 'http://houston/blotter/orders' --upload-file orders.csv # or orders.json
["6001:25", "6001:26"]

Instead of submitting a pre-made file of orders, you can also create orders directly in Python:

>>> from quantrocket.blotter import place_orders
>>> orders = []
>>> order1 = {
        "ConId": 265598,
        "Account": "DU123456",
        "Action": "BUY",
        "OrderRef": "dma-tech",
        "TotalQuantity": 500,
        "Exchange": "SMART",
        "OrderType": "MKT",
        "Tif": "DAY"
    }
>>> orders.append(order1)
>>> order2 = {
        "ConId": 3691937,
        "Account": "DU123456",
        "Action": "BUY",
        "OrderRef": "dma-tech",
        "TotalQuantity": 50,
        "Exchange": "SMART",
        "OrderType": "MKT",
        "Tif": "DAY"
    }
>>> orders.append(order2)
>>> order_ids = place_orders(orders)

Alternatively, you can place an order by specifying the order parameters directly on the command line. This approach is limited to placing one order at a time but is useful for testing and experimentation as well as one-off orders:

$ # order 500 shares of AAPL
$ quantrocket blotter order --params 'ConId:265598' 'Action:BUY' 'Exchange:SMART' 'TotalQuantity:500' 'OrderType:MKT' 'Tif:DAY' 'Account:DU123456' 'OrderRef:dma-tech'
6001:27

Order fields

IB offers a large assortment of order types and algos. Learn about the available order types on IB's website, and refer to the IB API documentation for API example orders and a full list of possible order parameters. It can be helpful to manually create an order in Trader Workstation to familiarize yourself with the order attributes before trying to create the order via the API.

Order fields in QuantRocket should always use UpperCamelCase, that is, a concatenation of capitalized words, e.g. "OrderType". (Within the IB API documentation itself you will sometimes see UpperCamelCase and sometimes lowerCamelCase depending on the programming language.)

Required fields

The following fields are required when placing an order:

  • ConId: the unique contract identifier for the security/instrument
  • Action: "BUY" or "SELL"
  • TotalQuantity: the number of shares or contracts to order
  • OrderType: the order type, e.g. "MKT" or "LMT"
  • Tif: the time-in-force, e.g. "DAY" or "GTC" (good-till-canceled)
  • OrderRef: a user-defined identifier used to associate the order with a trading strategy
  • Exchange: the exchange to route the order to (not necessarily the primary listing exchange), e.g. "SMART" or "NYSE". To see the available exchanges for a security, check the ValidExchanges field in the master file (quantrocket master get), or use Trader Workstation.
  • Account: the account number is required if connected to multiple accounts, as explained below

Specifying the account number in the Account field is a best practice and is required if IB Gateway is connected to more than one account. (Moonshot order CSVs always include the Account field.) If Account is not specified and the blotter (via the IB Gateway services) is only connected to one account, that account will be used. If Account is not specified and the blotter is connected to multiple accounts, the orders will be rejected:

$ quantrocket blotter order --params 'ConId:265598' 'Action:BUY' 'Exchange:SMART' 'TotalQuantity:500' 'OrderType:MKT' 'Tif:Day' 'OrderRef:dma-tech'
msg: 'no account specified and cannot infer because multiple accounts connected (connected
  accounts: DU12345,U12345; order:
  {"ConId": "265598", "Action": "BUY", "Exchange": "SMART", "TotalQuantity": "500",
  "OrderType": "MKT", "Tif": "Day", "OrderRef": "dma-tech"}'
status: error

The OrderRef field

IB provides the OrderRef field to allow users to assign arbitrary labels to orders for the user's own tracking purposes. In QuantRocket, the OrderRef field is required as it is used to associate orders with a particular trading strategy. For orders generated by Moonshot, the strategy code (e.g. "dma-tech") is used as the order ref. This enables the blotter to track positions and performance on a strategy-by-strategy basis.

Order IDs

When you place orders, the blotter generates and returns unique order IDs for each order:

$ quantrocket blotter order -f orders.csv
6001:25
6001:26
>>> from quantrocket.blotter import place_orders
>>> place_orders(infilepath_or_buffer="orders.csv")
['6001:25', '6001:26']
$ curl -X POST 'http://houston/blotter/orders' --upload-file orders.csv
["6001:25", "6001:26"]

Order IDs are used internally by the blotter and can be used to check order statuses or cancel orders. You can also check order statuses or cancel orders based on other lookups such as the order ref, account, or conid, so it is typically not necessary to hold on to the order IDs.

Order IDs take the form <ClientId:OrderNum>, where ClientId is the ID used by the blotter (the client) to connect to the IB API, and OrderNum is an auto-incrementing order number.

Parent-child orders, aka attached orders

IB provides the concept of attached orders, whereby a "parent" and "child" order are submitted to IB at the same time, but IB only activates the child order and submits it to the exchange if the parent order executes. Attached orders can be used for bracket orders and hedging orders, and can also be used in Moonshot to attach exit orders to entry orders.

Submitting an attached order requires adding a ParentId attribute to the child order, which should be set to the OrderId of the parent order. The following example CSV includes a market order to BUY 100 shares of AAPL, as well as a child order to sell 100 shares of AAPL at the close.

$ csvlook -I parent_child_orders.csv
| ConId  | Account  | Action | OrderRef  | TotalQuantity | Exchange | OrderType | Tif | OrderId | ParentId |
| ------ | -------- | ------ | --------- | ------------- | -------- | --------- | --- | ------- | -------- |
| 265598 | DU123456 | BUY    | strategy1 | 100           | SMART    | MKT       | DAY | 1       |          |
| 265598 | DU123456 | SELL   | strategy1 | 100           | SMART    | MOC       | DAY |         | 1        |

The ParentId of the second order links as a child order to the OrderId of the first order. Note that the OrderId and ParentId fields in your orders file are not the actual order IDs used by the blotter. The blotter uses OrderId/ParentId (if provided) to identify linked orders but then generates the actual order IDs at the time of order submission to IB. Therefore any number can be used for the OrderId/ParentId as long as they are unique within the file.

The parent order must precede the child order in the orders file.
The blotter expects parent-child orders to be submitted within the same file. Attaching child orders to parent orders that were placed at a previous time is not supported.
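
The same parent-child linkage can be expressed when creating orders directly in Python. The sketch below mirrors the CSV above; the OrderId/ParentId values are placeholders that simply link the two orders within the batch:

from quantrocket.blotter import place_orders

# parent order: buy 100 shares of AAPL at the market
parent = {
    "OrderId": 1,
    "ConId": 265598,
    "Account": "DU123456",
    "Action": "BUY",
    "OrderRef": "strategy1",
    "TotalQuantity": 100,
    "Exchange": "SMART",
    "OrderType": "MKT",
    "Tif": "DAY",
}
# child order: sell the shares at the close, activated only if the parent fills
child = {
    "ParentId": 1,
    "ConId": 265598,
    "Account": "DU123456",
    "Action": "SELL",
    "OrderRef": "strategy1",
    "TotalQuantity": 100,
    "Exchange": "SMART",
    "OrderType": "MOC",
    "Tif": "DAY",
}
# the parent must precede the child in the batch
order_ids = place_orders([parent, child])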

IB execution algos

IB provides various execution algos which can be helpful for working large orders into the market. In the IB API, these are specified by the AlgoStrategy and AlgoParams fields. The AlgoParams field is a nested field which expects a list of multiple algo-specific parameters. When submitting orders via a JSON file or directly via Python, the AlgoParams can be provided in a nested format. Here is an example of a VWAP order:

>>> orders = []
>>> order1 = {
        "ConId": 265598,
        "Account": "DU123456",
        "Action": "BUY",
        "OrderRef": "dma-tech",
        "TotalQuantity": 10000,
        "Exchange": "SMART",
        "OrderType": "LMT",
        "LmtPrice": 104.30,
        "AlgoStrategy": "Vwap",
        "AlgoParams": {
            "maxPctVol": 0.1,
            "noTakeLiq": 1,
        },
        "Tif": "DAY"
    }
>>> orders.append(order1)
>>> place_orders(orders)

Since CSV is a flat-file format, a CSV orders file requires a different syntax for AlgoParams. Algo parameters can be specified using underscore separators, e.g. AlgoParams_maxPctVol:

$ csvlook -I vwap_orders.csv
| ConId  | Account  | Action | OrderRef | TotalQuantity | AlgoStrategy | AlgoParams_maxPctVol | AlgoParams_noTakeLiq | ...
| ------ | -------- | ------ | -------- | ------------- | ------------ | -------------------- | -------------------- |
| 265598 | DU123456 | BUY    | dma-tech | 10000         | Vwap         | 0.1                  | 1                    |
In the above example, carefully note that AlgoParams is UpperCamelCase like other order fields, but the nested parameters (e.g. maxPctVol) are lowerCamelCase.

Order status

You can check order statuses based on a variety of lookups including the order ref, account, conid, order ID, or date range the order was submitted. For example, you could check the order statuses of all orders associated with a particular order ref and submitted on or after a particular date (such as today's date):

$ quantrocket blotter status -r 'my-strategy' -s '2018-05-18' | csvlook -I
| OrderId | ConId  | Action | TotalQuantity | Account  | OrderRef    | Status       | Filled | Remaining | ...
| ------- | ------ | ------ | ------------- | -------- | ----------- | ------------ | ------ | --------- |
| 6001:61 | 265598 | BUY    | 100           | DU123456 | my-strategy | Filled       | 100    | 0         |
| 6001:62 | 265598 | SELL   | 100           | DU123456 | my-strategy | PreSubmitted | 0      | 100       |
>>> from quantrocket.blotter import download_order_statuses
>>> import io
>>> f = io.StringIO()
>>> download_order_statuses(f, order_refs=["my-strategy"], start_date="2018-05-18")
>>> statuses = pd.read_csv(f, parse_dates=["Submitted"])
>>> statuses.head()
   OrderId           Submitted   ConId Action  TotalQuantity   Account     OrderRef        Status  Filled  Remaining  Errors
0  6001:61 2018-05-18 18:10:29  265598    BUY            100  DU123456  my-strategy        Filled     100          0     NaN
1  6001:62 2018-05-18 18:10:29  265598   SELL            100  DU123456  my-strategy  PreSubmitted       0        100     NaN
$ curl -X GET 'http://houston/blotter/orders.csv?order_refs=my-strategy&start_date=2018-05-18' | csvlook -I
| OrderId | ConId  | Action | TotalQuantity | Account  | OrderRef    | Status       | Filled | Remaining | ...
| ------- | ------ | ------ | ------------- | -------- | ----------- | ------------ | ------ | --------- |
| 6001:61 | 265598 | BUY    | 100           | DU123456 | my-strategy | Filled       | 100    | 0         |
| 6001:62 | 265598 | SELL   | 100           | DU123456 | my-strategy | PreSubmitted | 0      | 100       |
You'll see the order status as well as the shares filled and shares remaining. Open orders as well as completed orders are included. Optionally, you can show open orders only (this filter can also be combined with other filters):
$ quantrocket blotter status --open | csvlook -I
| OrderId | ConId     | Action | TotalQuantity | Account  | OrderRef        | Status       | Filled | Remaining | ...
| ------- | --------- | ------ | ------------- | -------- | --------------- | ------------ | ------ | --------- |
| 6001:62 | 265598    | SELL   | 100           | DU123456 | my-strategy     | PreSubmitted | 0      | 100       |
| 6001:64 | 269745169 | BUY    | 1             | DU123456 | es-fut-daytrade | Submitted    | 0      | 1         |
>>> f = io.StringIO()
>>> download_order_statuses(f, open_orders=True)
>>> statuses = pd.read_csv(f, parse_dates=["Submitted"])
>>> statuses.head()
   OrderId           Submitted      ConId Action  TotalQuantity   Account         OrderRef        Status  Filled  Remaining  Errors
0  6001:62 2018-05-18 18:10:29     265598   SELL            100  DU123456      my-strategy  PreSubmitted       0        100     NaN
1  6001:64 2018-05-18 18:33:08  269745169    BUY              1  DU123456  es-fut-daytrade     Submitted       0          1     NaN
$ curl -X GET 'http://houston/blotter/orders.csv?open_orders=true' | csvlook -I
| OrderId | ConId     | Action | TotalQuantity | Account  | OrderRef        | Status       | Filled | Remaining | ...
| ------- | --------- | ------ | ------------- | -------- | --------------- | ------------ | ------ | --------- |
| 6001:62 | 265598    | SELL   | 100           | DU123456 | my-strategy     | PreSubmitted | 0      | 100       |
| 6001:64 | 269745169 | BUY    | 1             | DU123456 | es-fut-daytrade | Submitted    | 0      | 1         |
You can request that additional order fields be returned:
$ # request OrderType and LmtPrice in output
$ # (Tip: if CSV becomes too wide for terminal, try requesting json and using json2yaml)
$ quantrocket blotter status --order-ids '6001:64' --fields 'OrderType' 'LmtPrice' --json | json2yaml
---
  -
    OrderId: "6001:64"
    Submitted: "2018-05-18T18:33:08+00:00"
    ConId: 269745169
    Action: "BUY"
    TotalQuantity: 1
    Account: "DU123456"
    OrderRef: "es-fut-daytrade"
    LmtPrice: 2000
    OrderType: "LMT"
    Status: "Submitted"
    Filled: 0
    Remaining: 1
    Errors: null
>>> f = io.StringIO()
>>> # request OrderType and LmtPrice in output
>>> download_order_statuses(f, order_ids=["6001:64"], fields=["OrderType", "LmtPrice"])
>>> statuses = pd.read_csv(f, parse_dates=["Submitted"])
>>> statuses.to_dict(orient="records")
[{'Account': 'DU123456',
  'Action': 'BUY',
  'ConId': 269745169,
  'Errors': nan,
  'Filled': 0,
  'LmtPrice': 2000.0,
  'OrderId': '6001:64',
  'OrderRef': 'es-fut-daytrade',
  'OrderType': 'LMT',
  'Remaining': 1,
  'Status': 'Submitted',
  'Submitted': Timestamp('2018-05-18 18:33:08'),
  'TotalQuantity': 1}]
$ # request OrderType and LmtPrice in output
$ # (Tip: if CSV becomes too wide for terminal, try requesting json and using json2yaml)
$ curl -X GET 'http://houston/blotter/orders.json?order_ids=6001%3A64&fields=OrderType&fields=LmtPrice' | json2yaml
---
  -
    OrderId: "6001:64"
    Submitted: "2018-05-18T18:33:08+00:00"
    ConId: 269745169
    Action: "BUY"
    TotalQuantity: 1
    Account: "DU123456"
    OrderRef: "es-fut-daytrade"
    LmtPrice: 2000
    OrderType: "LMT"
    Status: "Submitted"
    Filled: 0
    Remaining: 1
    Errors: null
Because there are many possible order parameters and because IB periodically adds new parameters, not every order parameter is saved to its own field in the blotter database. Order parameters which aren't saved to their own field are saved in JSON format to a common field called OrderDetailsJson. You can pass a "?" or any invalid fieldname to see the list of available fields; if the field you want is missing, it's stored in OrderDetailsJson:
$ # check available fields
$ quantrocket blotter status --field '?'
msg: 'unknown order status fields: ? (available fields are: Account, Action, AdjustableTrailingUnit,
  AdjustedStopLimitPrice, AdjustedStopPrice, AdjustedTrailingAmount, AlgoId, AlgoStrategy,
  AllOrNone, AuxPrice, BlockOrder, ClientId, ConId, DiscretionaryAmt, DisplaySize,
  Errors, Exchange, FaGroup, FaMethod, FaPercentage, FaProfile, Filled, GoodAfterTime,
  GoodTillDate, Hidden, LmtPrice, LmtPriceOffset, MinQty, NotHeld, OcaGroup, OcaType,
  OpenClose, OrderDetailsJson, OrderId, OrderNum, OrderRef, OrderType, Origin, OutsideRth,
  ParentId, PercentOffset, PermId, Remaining, Status, Submitted, SweepToFill, Tif,
  TotalQuantity, TrailStopPrice, TrailingPercent, Transmit, TriggerMethod, TriggerPrice,
  WhatIf'
status: error
$ # Look at the AlgoParams field on a Vwap order; it doesn't have its own
$ # field so it's stored in OrderDetailsJson
$ quantrocket blotter status -d '6001:65' --fields 'AlgoStrategy' 'OrderDetailsJson' --json | json2yaml
---
  -
    OrderId: "6001:65"
    Submitted: "2018-05-18T19:02:25+00:00"
    ConId: 265598
    Action: "BUY"
    TotalQuantity: 10000
    Account: "DU123456"
    OrderRef: "my-strategy"
    OrderDetailsJson:
      AlgoParams:
        maxPctVol: 0.1
        noTakeLiq: 0
    AlgoStrategy: "Vwap"
    Status: "Submitted"
    Filled: 4000
    Remaining: 6000
    Errors: null
>>> f = io.StringIO()
>>> # check available fields
>>> download_order_statuses(f, fields=["?"])
HTTPError: ('400 Client Error: BAD REQUEST for url: http://houston/blotter/orders.csv?fields=%3F', {'status': 'error', 'msg': 'unknown order status fields: ? (available fields are: Account, Action, AdjustableTrailingUnit, AdjustedStopLimitPrice, AdjustedStopPrice, AdjustedTrailingAmount, AlgoId, AlgoStrategy, AllOrNone, AuxPrice, BlockOrder, ClientId, ConId, DiscretionaryAmt, DisplaySize, Errors, Exchange, FaGroup, FaMethod, FaPercentage, FaProfile, Filled, GoodAfterTime, GoodTillDate, Hidden, LmtPrice, LmtPriceOffset, MinQty, NotHeld, OcaGroup, OcaType, OpenClose, OrderDetailsJson, OrderId, OrderNum, OrderRef, OrderType, Origin, OutsideRth, ParentId, PercentOffset, PermId, Remaining, Status, Submitted, SweepToFill, Tif, TotalQuantity, TrailStopPrice, TrailingPercent, Transmit, TriggerMethod, TriggerPrice, WhatIf'})
>>> # Look at the AlgoParams field on a Vwap order; it doesn't have its own
>>> # field so it's stored in OrderDetailsJson
>>> download_order_statuses(f, order_ids=["6001:65"], fields=["AlgoStrategy", "OrderDetailsJson"])
>>> statuses = pd.read_csv(f, parse_dates=["Submitted"])
>>> statuses.iloc[0]
OrderId                                                       6001:65
Submitted                                         2018-05-18 19:02:25
ConId                                                          265598
Action                                                            BUY
TotalQuantity                                                    1000
Account                                                      DU123456
OrderRef                                                  my-strategy
OrderDetailsJson   {'AlgoParams': {'maxPctVol': 0.1, 'noTakeLiq': 0}}
AlgoStrategy                                                     Vwap
Status                                                      Submitted
Filled                                                              0
Remaining                                                        1000
Errors                                                            NaN
$ # check available fields
$ curl -X GET 'http://houston/blotter/orders.csv?fields=?'
{"status": "error", "msg": "unknown order status fields: ? (available fields are: Account, Action, AdjustableTrailingUnit, AdjustedStopLimitPrice, AdjustedStopPrice, AdjustedTrailingAmount,AlgoId, AlgoStrategy, AllOrNone, AuxPrice, BlockOrder, ClientId, ConId, DiscretionaryAmt, DisplaySize, Errors, Exchange, FaGroup, FaMethod, FaPercentage, FaProfile, Filled, GoodAfterTime, GoodTillDate, Hidden, LmtPrice, LmtPriceOffset, MinQty, NotHeld, OcaGroup, OcaType, OpenClose, OrderDetailsJson, OrderId, OrderNum, OrderRef, OrderType, Origin, OutsideRth, ParentId, PercentOffset, PermId, Remaining, Status, Submitted, SweepToFill, Tif, TotalQuantity, TrailStopPrice, TrailingPercent, Transmit, TriggerMethod, TriggerPrice, WhatIf"}
$ # Look at the AlgoParams field on a Vwap order; it doesn't have its own
$ # field so it's stored in OrderDetailsJson
$  curl -X GET 'http://houston/blotter/orders.json?order_ids=6001%3A65&fields=AlgoStrategy&fields=OrderDetailsJson' | json2yaml
---
  -
    OrderId: "6001:65"
    Submitted: "2018-05-18T19:02:25+00:00"
    ConId: 265598
    Action: "BUY"
    TotalQuantity: 10000
    Account: "DU123456"
    OrderRef: "my-strategy"
    OrderDetailsJson:
      AlgoParams:
        maxPctVol: 0.1
        noTakeLiq: 0
    AlgoStrategy: "Vwap"
    Status: "Submitted"
    Filled: 4000
    Remaining: 6000
    Errors: null

Possible order statuses

The IB API defines the following order statuses:

  • ApiPending - indicates order has not yet been sent to IB server, for instance if there is a delay in receiving the security definition. Uncommonly received.
  • PendingSubmit - indicates the order was sent from TWS, but confirmation has not been received that it has been received by the destination. Most commonly because exchange is closed.
  • PendingCancel - indicates that a request has been sent to cancel an order but confirmation has not been received of its cancellation.
  • PreSubmitted - indicates that a simulated order type has been accepted by the IB system and that this order has yet to be elected. The order is held in the IB system until the election criteria are met. At that time the order is transmitted to the order destination as specified.
  • Submitted - indicates that your order has been accepted at the order destination and is working.
  • ApiCancelled - after an order has been submitted and before it has been acknowledged, an API client can request its cancellation, producing this state.
  • Cancelled - indicates that the balance of your order has been confirmed cancelled by the IB system. This could occur unexpectedly when IB or the destination has rejected your order.
  • Filled - indicates that the order has been completely filled.
  • Inactive - indicates an order is not working, possible reasons include:
    • it is invalid or triggered an error. A corresponding error code is expected.
    • the order is to short shares but the order is being held while shares are being located.
    • an order is placed manually in TWS while the exchange is closed.
    • an order is blocked by TWS due to a precautionary setting and appears there in an untransmitted state
  • Error - this order status is provided by QuantRocket for orders that are immediately rejected by IB's system and thus never receive an order status from IB

Order errors and rejections

Your order might be rejected by the blotter or (more commonly) by IB or the exchange. The blotter performs basic validation of your orders such as making sure required fields are present:

$ quantrocket blotter order -p 'ConId:269745169' 'Action:BUY' 'OrderType:MKT' 'Tif:DAY' 'TotalQuantity:1'
msg: 'missing required fields OrderRef,Exchange,Account for order: {"ConId": "269745169",
 "Action": "BUY", "OrderType": "MKT", "Tif": "DAY", "TotalQuantity": "1"}'
status: error

If the blotter rejects your orders, as indicated by an error message being returned, this means the whole batch of orders was rejected. In other words, either all of the orders are submitted to IB, or none are.
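
In the Python client, a rejected batch surfaces as an exception rather than a return value. A minimal sketch of handling it might look like the following, assuming the exception raised is the requests library's HTTPError, as shown in the order fields example earlier:

from quantrocket.blotter import place_orders
from requests.exceptions import HTTPError

try:
    order_ids = place_orders(infilepath_or_buffer="orders.csv")
except HTTPError as e:
    # the whole batch was rejected by the blotter (e.g. missing required fields)
    print("orders rejected:", e)
else:
    # the batch was submitted; IB accepts or rejects each order individually
    print("orders submitted:", order_ids)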

In contrast, if the batch of orders is submitted to IB (as indicated by the blotter returning a list of order IDs), IB and/or the exchange will accept or reject each order independently. You can check the order status to see if the order was rejected or cancelled. Any error messages from IB will be provided in the Errors field. For example, if you don't have sufficient equity in your account, you might see an error like this:

$ quantrocket blotter status -d '6003:15' --json | json2yaml
---
  -
    OrderId: "6003:15"
    Submitted: "2018-02-20T16:59:40+00:00"
    ConId: 3691937
    Action: "SELL"
    TotalQuantity: 300
    Account: "DU123456"
    OrderRef: "my-strategy"
    Status: "Cancelled"
    Filled: 0
    Remaining: 300
    Errors:
      -
        ErrorCode: 202
        ErrorMsg: "Order Canceled - reason:Your order is not accepted because your Equity with Loan Value of [499521.99 USD] is insufficient to cover the Initial Margin requirement of [537520.21 USD]\n"
>>> f = io.StringIO()
>>> download_order_statuses(f, order_ids=["6003:15"])
>>> statuses = pd.read_csv(f, parse_dates=["Submitted"])
>>> statuses.to_dict(orient="records")
[{'Account': 'DU123456',
  'Action': 'SELL',
  'ConId': 3691937,
  'Errors': '[{"ErrorCode": 202, "ErrorMsg": "Order Canceled - reason:Your order is not accepted because your Equity with Loan Value of [499521.99 USD] is insufficient to cover the Initial Margin requirement of [537520.21 USD]\n"}]',
  'Filled': 0,
  'OrderId': '6003:15',
  'OrderRef': 'my-strategy',
  'Remaining': 300,
  'Status': 'Cancelled',
  'Submitted': Timestamp('2018-02-20 16:59:40'),
  'TotalQuantity': 300}]
$ curl -X GET 'http://houston/blotter/orders.json?order_ids=6003%3A15' | json2yaml
---
  -
    OrderId: "6003:15"
    Submitted: "2018-02-20T16:59:40+00:00"
    ConId: 3691937
    Action: "SELL"
    TotalQuantity: 300
    Account: "DU123456"
    OrderRef: "my-strategy"
    Status: "Cancelled"
    Filled: 0
    Remaining: 300
    Errors:
      -
        ErrorCode: 202
        ErrorMsg: "Order Canceled - reason:Your order is not accepted because your Equity with Loan Value of [499521.99 USD] is insufficient to cover the Initial Margin requirement of [537520.21 USD]\n"

Error messages don't always mean the order was rejected or cancelled. Some errors are more like informational warnings (for example, error 404 when shares aren't available for shorting: "Order held while securities are located"). Always check the specific error message and accompanying order status. You can look up the error code in IB's API documentation to get more information about the error, or open a support ticket with IB customer service.
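
To review errors programmatically, you can parse the Errors column of the order status CSV, which (as in the example above) contains a JSON-formatted list of error codes and messages. A minimal sketch:

import io
import json
import pandas as pd
from quantrocket.blotter import download_order_statuses

# download statuses for a strategy and print any error messages returned by IB
f = io.StringIO()
download_order_statuses(f, order_refs=["my-strategy"], start_date="2018-05-18")
statuses = pd.read_csv(f, parse_dates=["Submitted"])

for _, row in statuses.dropna(subset=["Errors"]).iterrows():
    for error in json.loads(row["Errors"]):
        print(row["OrderId"], row["Status"], error["ErrorCode"], error["ErrorMsg"])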

One error that bears special mention because it is potentially confusing is error code 200: "No security definition has been found for the request." Normally, this error occurs when a security has been delisted and is no longer available in IB's database. However, in the context of order statuses, you can receive error code 200 for a valid conid if you try to route the order to an invalid exchange for the security:

$ # try to buy AAPL stock on GLOBEX, where it doesn't trade
$ quantrocket blotter order -p 'ConId:265598' 'Action:BUY' 'OrderType:MKT' 'Exchange:GLOBEX' 'Tif:DAY' 'OrderRef:my-strategy' 'TotalQuantity:100'
6001:66
$ quantrocket blotter status -d '6001:66' --json | json2yaml
---
  -
    OrderId: "6001:66"
    Submitted: "2018-05-18T20:37:25+00:00"
    ConId: 265598
    Action: "BUY"
    TotalQuantity: 100
    Account: "DU123456"
    OrderRef: "my-strategy"
    Status: "Error"
    Filled: 0
    Remaining: 100
    Errors:
      -
        ErrorCode: 200
        ErrorMsg: "No security definition has been found for the request"

Cancel orders

You can cancel orders by order ID, account, conid, or order ref. For example, cancel all open orders for a particular order ref:

$ quantrocket blotter cancel --order-refs 'my-strategy'
order_ids:
- 6001:62
- 6001:65
status: the orders will be canceled asynchronously
>>> from quantrocket.blotter import cancel_orders
>>> cancel_orders(order_refs=["my-strategy"])
{'order_ids': ['6001:62', '6001:65'],
 'status': 'the orders will be canceled asynchronously'}
$ curl -X DELETE 'http://houston/blotter/orders?order_refs=my-strategy'
{"order_ids": ["6001:62", "6001:65"], "status": "the orders will be canceled asynchronously"}
Or cancel all open orders:
$ quantrocket blotter cancel --all
order_ids:
- 6001:66
- 6001:67
- 6001:70
status: the orders will be canceled asynchronously
>>> from quantrocket.blotter import cancel_orders
>>> cancel_orders(cancel_all=True)
{'order_ids': ['6001:66', '6001:67', '6001:70'],
 'status': 'the orders will be canceled asynchronously'}
$ curl -X DELETE 'http://houston/blotter/orders?cancel_all=true'
{"order_ids": ["6001:66", "6001:67", "6001:70"], "status": "the orders will be canceled asynchronously"}
Canceling an order submits the cancellation request to IB. To verify that the orders were actually cancelled, check the order status:
$ quantrocket blotter status -d '6001:62' --json | json2yaml
---
  -
    OrderId: "6001:62"
    Submitted: "2018-05-18T18:33:08+00:00"
    ConId: 265598
    Action: "BUY"
    TotalQuantity: 100
    Account: "DU12345"
    OrderRef: "my-strategy"
    Status: "Cancelled"
    Filled: 0
    Remaining: 100
    Errors:
      -
        ErrorCode: 202
        ErrorMsg: "Order Canceled - reason:"
>>> f = io.StringIO()
>>> download_order_statuses(f, order_ids=["6001:62"])
>>> statuses = pd.read_csv(f, parse_dates=["Submitted"])
>>> statuses.to_dict(orient="records")
[{'Account': 'DU12345',
  'Action': 'BUY',
  'ConId': 265598,
  'Errors': '[{"ErrorCode": 202, "ErrorMsg": "Order Canceled - reason:"}]',
  'Filled': 0,
  'OrderId': '6001:62',
  'OrderRef': 'my-strategy',
  'Remaining': 100,
  'Status': 'Cancelled',
  'Submitted': Timestamp('2018-05-18 18:33:08'),
  'TotalQuantity': 100}]
$ curl -X GET 'http://houston/blotter/orders.json?order_ids=6001%3A62' | json2yaml
---
  -
    OrderId: "6001:62"
    Submitted: "2018-05-18T18:33:08+00:00"
    ConId: 265598
    Action: "BUY"
    TotalQuantity: 100
    Account: "DU12345"
    OrderRef: "my-strategy"
    Status: "Cancelled"
    Filled: 0
    Remaining: 100
    Errors:
      -
        ErrorCode: 202
        ErrorMsg: "Order Canceled - reason:"

Track positions

The blotter tracks your positions by account, conid, and order ref:

$ quantrocket blotter positions | csvlook -I
| Account  | OrderRef         | ConId     | Quantity |
| -------- | ---------------- | --------- | -------- |
| DU123456 | dma-tech         | 265598    | 541      |
| DU123456 | dma-tech         | 3691937   | 108      |
| DU123456 | my-strategy      | 265598    | 200      |
| U1234567 | es-fut-daytrade  | 269745169 | -1       |
| U1234567 | my-strategy      | 265598    | -100     |
>>> from quantrocket.blotter import download_positions
>>> import io
>>> f = io.StringIO()
>>> download_positions(f)
>>> positions = pd.read_csv(f)
>>> positions.head()
    Account          OrderRef      ConId  Quantity
0  DU123456          dma-tech     265598       541
1  DU123456          dma-tech    3691937       108
2  DU123456       my-strategy     265598       200
3  U1234567   es-fut-daytrade  269745169        -1
4  U1234567       my-strategy     265598      -100
$ curl -X GET 'http://houston/blotter/positions.csv' | csvlook -I
| Account  | OrderRef         | ConId     | Quantity |
| -------- | ---------------- | --------- | -------- |
| DU123456 | dma-tech         | 265598    | 541      |
| DU123456 | dma-tech         | 3691937   | 108      |
| DU123456 | my-strategy      | 265598    | 200      |
| U1234567 | es-fut-daytrade  | 269745169 | -1       |
| U1234567 | my-strategy      | 265598    | -100     |

The blotter tracks positions by order ref so that multiple trading strategies can trade the same security and independently manage their positions. (Moonshot uses the blotter to take account of existing positions when generating orders.) IB does not track or report positions by order ref (only by account and conid), so the blotter tracks positions independently by monitoring trade executions.

For casual viewing of your portfolio where segregation by order ref isn't required, you may find the account portfolio endpoint more convenient than using the blotter. The account portfolio endpoint provides a basic snapshot of what is visible in TWS, including descriptive labels for your positions (the blotter shows conids only), realized and unrealized PNL, and several other fields.

Close positions

You can use the blotter to generate a CSV of orders to close existing positions by account, conid, and/or order ref. Suppose you hold the following positions for a particular strategy:

$ quantrocket blotter positions --order-refs 'dma-tech' | csvlook -I
| Account  | OrderRef | ConId   | Quantity |
| -------- | -------- | ------- | -------- |
| DU123456 | dma-tech | 265598  | 1001     |
| DU123456 | dma-tech | 3691937 | -108     |
>>> f = io.StringIO()
>>> download_positions(f, order_refs=["dma-tech"])
>>> positions = pd.read_csv(f)
>>> positions.head()
    Account          OrderRef      ConId  Quantity
0  DU123456          dma-tech     265598      1001
1  DU123456          dma-tech    3691937      -108
$ curl -X GET 'http://houston/blotter/positions.csv?order_refs=dma-tech' | csvlook -I
| Account  | OrderRef | ConId   | Quantity |
| -------- | -------- | ------- | -------- |
| DU123456 | dma-tech | 265598  | 1001     |
| DU123456 | dma-tech | 3691937 | -108     |
To facilitate closing the positions, the blotter can generate a similar CSV output with the addition of an Action column set to "BUY" or "SELL" as needed to flatten the positions. You can specify additional order parameters to be appended to the CSV. In this example, we create SMART-routed market orders:
$ quantrocket blotter close --order-refs 'dma-tech' --params 'OrderType:MKT' 'Tif:Day' 'Exchange:SMART' | csvlook -I
| Account  | OrderRef | ConId   | TotalQuantity | Action | OrderType | Tif | Exchange |
| -------- | -------- | ------- | ------------- | ------ | --------- | --- | -------- |
| DU123456 | dma-tech | 265598  | 1001          | SELL   | MKT       | Day | SMART    |
| DU123456 | dma-tech | 3691937 | 108           | BUY    | MKT       | Day | SMART    |
>>> from quantrocket.blotter import close_positions
>>> import io
>>> f = io.StringIO()
>>> close_positions(f, order_refs=["dma-tech"], params={"OrderType":"MKT", "Tif":"Day", "Exchange":"SMART"})
>>> orders = pd.read_csv(f)
>>> orders.head()
    Account  OrderRef    ConId  TotalQuantity Action OrderType  Tif Exchange
0  DU123456  dma-tech   265598           1001   SELL       MKT  Day    SMART
1  DU123456  dma-tech  3691937            108    BUY       MKT  Day    SMART
$ curl -X DELETE 'http://houston/blotter/positions.csv?order_refs=dma-tech&params=OrderType%3AMKT&params=Tif%3ADay&params=Exchange%3ASMART' | csvlook -I
| Account  | OrderRef | ConId   | TotalQuantity | Action | OrderType | Tif | Exchange |
| -------- | -------- | ------- | ------------- | ------ | --------- | --- | -------- |
| DU123456 | dma-tech | 265598  | 1001          | SELL   | MKT       | Day | SMART    |
| DU123456 | dma-tech | 3691937 | 108           | BUY    | MKT       | Day | SMART    |

Using the CLI, you can pipe the resulting orders CSV to the blotter to be placed:

$ quantrocket blotter close --order-refs 'dma-tech' --params 'OrderType:MKT' 'Tif:Day' 'Exchange:SMART' | quantrocket blotter order -f '-'
6001:79
6001:80

Any order parameters you specify using --params are applied to each order in the file. To set parameters that vary per order (such as limit prices), save the CSV to file, edit it, then submit the orders:

$ quantrocket blotter close --order-refs 'dma-tech' --params 'OrderType:LMT' 'LmtPrice:0' 'Exchange:SMART' -o orders.csv
$ # edit orders.csv, then:
$ quantrocket blotter order -f orders.csv
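
The same workflow can be done in Python without an intermediate file by editing the orders in pandas before submitting them. This is a sketch only: it assumes place_orders accepts a file-like buffer (as its parameter name suggests), and the limit prices shown are placeholders:

import io
import pandas as pd
from quantrocket.blotter import close_positions, place_orders

# generate the closing orders into a buffer instead of a file
f = io.StringIO()
close_positions(f, order_refs=["dma-tech"],
                params={"OrderType": "LMT", "LmtPrice": 0, "Exchange": "SMART", "Tif": "Day"})
orders = pd.read_csv(f)

# set per-order limit prices (placeholder values for illustration)
orders.loc[orders.ConId == 265598, "LmtPrice"] = 188.50
orders.loc[orders.ConId == 3691937, "LmtPrice"] = 1592.00

# submit the edited orders
order_ids = place_orders(infilepath_or_buffer=io.StringIO(orders.to_csv(index=False)))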

Close positions from TWS

If you prefer, you can close a position manually from within Trader Workstation. If you do so, make sure to enable the Order Ref field in TWS (field location varies by TWS screen and configuration) and set the appropriate order ref so that the blotter can associate the trade execution with the correct strategy:

TWS order ref

Performance Tracking

Tracking the performance of your trading strategies after they go live is just as important as backtesting them before they go live. As D.E. Shaw once said, "Analyzing the results of live trading taught us things that couldn't be learned by studying historical data." QuantRocket saves all of your trade executions to the blotter database and makes it easy to analyze your live performance. You can plot your PNL by strategy and account using Moonchart and analyze your results in pandas.

PNL

Once you've accumulated some live trading results, you can query your PNL from the blotter, optionally filtering by account, order ref (=strategy code), conid, or date range. Moonchart, the library used for Moonshot backtest visualizations, is also designed to support live trading visualization:

$ quantrocket blotter pnl --order-refs 'japan-overnight' 'canada-energy' 'midcap-earnings' 't3-nyse' --pdf -o pnl_tearsheet.pdf
>>> from quantrocket.blotter import download_pnl
>>> import moonchart
>>> download_pnl("pnl.csv", order_refs=["japan-overnight", "canada-energy", "midcap-earnings", "t3-nyse"])
>>> t = moonchart.Tearsheet()
>>> t.from_pnl_csv("pnl.csv")
$ curl -X GET 'http://houston/blotter/pnl.csv?order_refs=japan-overnight&order_refs=canada-energy&order_refs=midcap-earnings&order_refs=t3-nyse&pdf=true' > pnl_tearsheet.pdf

The performance plots will look similar to those you get for a Moonshot backtest, plus a few additional PNL-specific plots:

PNL tearsheet

The blotter can return a CSV of PNL results, or a PDF tear sheet created from the CSV. The CSV output can be loaded into a DataFrame:

>>> import pandas as pd
>>> from quantrocket.blotter import download_pnl
>>> download_pnl("pnl.csv")
>>> results = pd.read_csv("pnl.csv", parse_dates=["Date"], index_col=["Field","Date"])
>>> results.tail()
                           japan-overnight canada-energy midcap-earnings t3-nyse
Field  Date
Return 2016-12-26 23:59:59             0.0           0.0             0.0     0.0
       2016-12-27 23:59:59       -2.58e-05           0.0             0.0     0.0
       2016-12-28 23:59:59      -0.0014064   -0.00399822             0.0     0.0
       2016-12-29 23:59:59     -0.00097874   -0.00118507             0.0     0.0
       2016-12-30 23:59:59     -0.00097907   -0.00263642             0.0     0.0

Similar to a Moonshot backtest, the DataFrame consists of several stacked DataFrames, one DataFrame per field (see PNL field reference). Use .loc to isolate a particular field:

>>> pnl = results.loc["Pnl"]
>>> pnl.tail()
                    japan-overnight canada-energy midcap-earnings t3-nyse
Date
2016-12-26 23:59:59             0.0           0.0             0.0     0.0
2016-12-27 23:59:59        -10.6375           0.0             0.0     0.0
2016-12-28 23:59:59       -580.4926    -1650.2522             0.0     0.0
2016-12-29 23:59:59       -400.7061     -485.1809             0.0     0.0
2016-12-30 23:59:59       -402.0676      -1082.67             0.0     0.0
PNL is reported in your account's base currency. QuantRocket's blotter takes care of converting trades denominated in foreign currencies.

PNL field reference

PNL result CSVs contain the following fields in a stacked format. Each field is a DataFrame:

  • Pnl: the daily PNL after commissions, expressed in the base currency
  • CommissionAmount: the daily commissions paid, expressed in the base currency
  • Commission: the commissions expressed as a decimal percentage of the net liquidation value
  • NetLiquidation: the net liquidation value (account balance) for the account, as stored in the account database
  • Return: the daily PNL (after commissions) expressed as a decimal percentage of the net liquidation value
  • NetExposure: the net long or short positions as of the PNL snapshot time, expressed as a decimal percentage of the net liquidation value
  • AbsExposure: the absolute value of positions as of the PNL snapshot time, irrespective of their side (long or short). Expressed as a decimal percentage of the net liquidation value. This represents the total market exposure of the strategy.
  • OrderRef: the order ref (= strategy code)
  • Account: the account number

The CSV/DataFrame column names—and the resulting series names in tear sheet plots—depend on how many accounts and order refs are included in the query results. For PNL results using --details/details=True, there is a column per security. For non-detailed, multi-strategy, or multi-account PNL results, there is a column per strategy per account, with each column containing the aggregated (summed) results of all component securities for that strategy and account. The table below provides a summary:

| If PNL query results are for...                                     | column names will be... |
| ------------------------------------------------------------------- | ----------------------- |
| one account, multiple order refs                                     | order refs              |
| one order ref, multiple accounts                                     | accounts                |
| multiple accounts, multiple order refs                               | <OrderRef> - <Account>  |
| one account, one order ref, and --details/details=True is specified  | securities (conids)     |

PNL snapshot time

The trade execution records from which PNL is calculated are timestamped to 1-second resolution; however, for simplicity, the PNL results returned by the blotter are aggregated to 1-day resolution. You can control the time which is used as the boundary between days, i.e. the PNL snapshot time. By default, daily PNL is calculated as of 23:59:59 UTC, but you can specify a different time (and timezone) if you prefer:

$ # get a snapshot of daily PNL as of shortly after the US market close
$ quantrocket blotter pnl --time '16:30:00 America/New_York' -o pnl.csv
>>> from quantrocket.blotter import download_pnl
>>> # get a snapshot of daily PNL as of shortly after the US market close
>>> download_pnl("pnl.csv", time="16:30:00 America/New_York")
$ # get a snapshot of daily PNL as of shortly after the US market close
$ curl -X GET 'http://houston/blotter/pnl.csv?time=16%3A30%3A00+America%2FNew_York' > pnl.csv

When you specify a time, the PNL results will still be timestamped in UTC, but the UTC time will correspond to the time you requested:

$ # 20:30:00 UTC = 16:30:00 America/New_York
$ head pnl.csv | csvlook -I
| Field       | Date                | ...
| ----------- | ------------------- |
| AbsExposure | 2016-01-04 20:30:00 |

The snapshot time can impact your PNL results in two main ways:

  • PNL results are based on realized PNL. Therefore, a position closed prior to the snapshot time will be reflected in that day's PNL. A position closed after the snapshot time will be reflected in the next day's PNL.
  • Whereas the Pnl, CommissionAmount, Commission, and Return fields are daily aggregations, the NetExposure and AbsExposure fields are not daily aggregations but snapshots of exposure (position size) as of the snapshot time. Therefore, a position closed prior to the snapshot time will result in a NetExposure and AbsExposure of 0, even though the Pnl and other aggregated fields may be nonzero. This is technically correct since no position was held as of the snapshot time, but it can be confusing since it gives the appearance of earning profit or loss despite holding no position. Also, it can cause the Normalized CAGR (CAGR/Exposure) statistic in Moonchart to display inf (infinity) due to division by zero.

How PNL is calculated

PNL is calculated from trade execution records received from IB and saved to the blotter database. The calculation (in simplified form) works as follows:

  • For each execution, calculate the proceeds (price × quantity bought or sold). For sales, the proceeds are positive; for purchases, the proceeds are negative (referred to as the cost basis).
  • For each security (segregated per account and order ref), calculate the cumulative proceeds over time as shares/contracts are bought and sold.
  • Likewise, calculate the cumulative quantity/position size over time as shares are bought and sold.
  • The cumulative PNL (before commissions) is equal to the cumulative proceeds, but only when the cumulative quantity is zero, i.e. when the position has been closed. (When the quantity is nonzero, i.e. a position is open, the cumulative proceeds reflect a temporary credit or debit that will be offset when the position is closed. Thus cumulative proceeds do not represent PNL when there is an open position.)

The following example illustrates the calculation:

Action                             Proceeds    Cumulative proceeds    Cumulative quantity    Cumulative PNL
BUY 200 shares of AAPL at $100     -$20,000    -$20,000               200
SELL 100 shares of AAPL at $105    $10,500     -$9,500                100
SELL 100 shares of AAPL at $110    $11,000     $1,500                 0                      $1,500
SELL 100 shares of AAPL at $115    $11,500     $13,000                -100
BUY 100 shares of AAPL at $120     -$12,000    $1,000                 0                      $1,000
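The same logic can be expressed in a few lines of pandas. The sketch below is for illustration only, not the blotter's actual implementation; it assumes a hypothetical DataFrame of executions with signed Quantity (positive for buys, negative for sells) and per-share Price:

import pandas as pd

# hypothetical executions for one security, mirroring the table above
executions = pd.DataFrame({
    "Quantity": [200, -100, -100, -100, 100],   # signed: buys positive, sells negative
    "Price": [100, 105, 110, 115, 120],
})

# proceeds are positive for sales, negative for purchases
executions["Proceeds"] = -executions["Quantity"] * executions["Price"]

# cumulative proceeds and cumulative position size over time
executions["CumProceeds"] = executions["Proceeds"].cumsum()
executions["CumQuantity"] = executions["Quantity"].cumsum()

# cumulative PNL equals cumulative proceeds, but only where the position is flat
executions["CumPnl"] = executions["CumProceeds"].where(executions["CumQuantity"] == 0)
print(executions)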

Accurate PNL calculation requires the blotter to have a complete history of trade executions. If executions are missing, not only will those trades not be reflected in the PNL but the cumulative quantities will be wrong, impacting the entire calculation. See the next section for best practices to ensure a complete history.

You may notice that PNL queries run faster the second time than the first time. The first time a PNL query runs, the blotter queries the entire execution history, calculates PNL, and caches the results in the blotter database. Subsequently, the cached results are returned, resulting in a speedup. The next time a new execution occurs for a particular account and order ref, the cached results for that account and order ref are deleted, forcing the blotter to recalculate PNL from the raw execution history the next time a PNL query is run.

Execution tracking best practices

Accurate PNL calculation requires the blotter to have a complete history of trade executions.

Whenever the blotter is connected to IB Gateway, it retrieves all available executions from IB every minute or so. The IB API makes available the current day's executions; more specifically, it makes available all executions which have occurred since the most recent IB server restart, the timing of which depends on the customer's location.

Consequently, to ensure the blotter has a complete execution history, the blotter must be connected to IB Gateway at least once after all executions for the day have finished and before the daily IB server restart. Executions could be missed under the following sequence of events:

  1. you place a non-marketable or held order
  2. you stop the IB Gateway service; thus the blotter is no longer receiving execution notifications from IB
  3. the order is subsequently filled
  4. you don't restart IB Gateway until after the next IB server restart, at which time the missed execution is no longer available from the IB API

A good rule of thumb: if you have working orders, keep IB Gateway running so the blotter can be notified of executions. If you need to stop IB Gateway while there are working orders, make sure to restart it at least once before the end of the day.

PNL caveats

Be aware of the following current limitations of PNL calculation:

  • At present, positions are only priced when there is an execution; they are not marked-to-market on a daily basis. Thus, only realized PNL is reflected; unrealized PNL/open positions are not reflected.
  • Due to positions not being marked-to-market, performance plots for multi-day positions may appear jumpy, that is, have flat lines for the duration of the position followed by a large jump in PNL when the position is closed. This jumpiness can affect the Sharpe ratio compared to what it would be if the positions were marked-to-market. The more frequently your strategy trades, the less this will be an issue.
  • At present, dividends (received or debited) are not reflected in PNL.
  • Margin interest and other fees are not reflected in PNL.
  • At present, stock splits on existing positions are not accounted for by the blotter. Consequently PNL calculations will be wrong for positions that undergo splits, since the opening and closing quantities will not match.
  • IB commissions for FX trades are denominated in USD rather than in the base currency or trade currency. The blotter handles FX commissions correctly for accounts with USD base currency, but not for accounts with non-USD base currencies. This defect will be remedied in a future release.

Executions

You can download and review the "raw" execution records from the blotter rather than the calculated PNL, optionally filtering by account, order ref, conid, or date range:

$ quantrocket blotter executions -s '2018-03-01' --order-refs 'dma-tech' -o executions.csv
>>> from quantrocket.blotter import download_executions
>>> download_executions("executions.csv", start_date="2018-03-01", order_refs=["dma-tech"])
$ curl -X GET 'http://houston/blotter/executions.csv?order_refs=dma-tech&start_date=2018-03-01' > executions.csv

Execution records contain a combination of fields provided directly by the IB API and QuantRocket-provided fields related to currency conversions. An example execution is shown below:

$ head executions.csv | csvjson | json2yaml
  -
    ExecId: "00018037.55555555.01.01"
    OrderId: "6001:55"
    Account: "DU123456"
    OrderRef: "dma-tech"
    ConId: "265598"
    Time: "2018-05-18 14:01:36"
    Exchange: "BEX"
    Price: "186.84"
    Side: "BOT"
    Quantity: "100"
    Commission: "0.360257"
    Liquidation: "0"
    LastLiquidity: "2"
    Symbol: "AAPL"
    PrimaryExchange: "NASDAQ"
    Currency: "USD"
    SecType: "STK"
    Multiplier: "1.0"
    PriceMagnifier: "1"
    LastTradeDate: null
    Strike: "0.0"
    Right: null
    NetLiquidation: "1008491.11"
    BaseCurrency: "USD"
    Rate: "1"
    GrossProceeds: "-18684.0"
    Proceeds: "-18684.360257"
    ProceedsInBaseCurrency: "-18684.360257"
    CommissionInBaseCurrency: "0.360257"
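The sample record also illustrates how the QuantRocket-provided currency fields relate to one another: Proceeds appears to equal GrossProceeds less the commission, and the *InBaseCurrency fields appear to be the trade-currency values converted at Rate. Below is a small, hedged sketch for sanity-checking a downloaded executions file (field semantics are inferred from the sample above, not a formal specification):

import pandas as pd

executions = pd.read_csv("executions.csv", parse_dates=["Time"])

# Proceeds appears to be GrossProceeds less the commission
# (GrossProceeds is negative for buys, positive for sells)
print((executions["GrossProceeds"] - executions["Commission"]
       - executions["Proceeds"]).abs().max())

# the *InBaseCurrency fields appear to be the trade-currency values times Rate
print((executions["Proceeds"] * executions["Rate"]
       - executions["ProceedsInBaseCurrency"]).abs().max())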

Logging

Stream logs in real-time

You can stream your logs, tail -f style, from flightlog:

$ quantrocket flightlog stream
2017-01-18 10:19:31 quantrocket.flightlog: INFO Detected a change in flightlog configs directory, reloading configs...
2017-01-18 10:19:31 quantrocket.flightlog: INFO Successfully loaded config
2017-01-18 14:25:57 quantrocket.master: INFO Requesting contract details for error 200 symbols

Flightlog provides application-level monitoring of the sort you will typically want to keep an eye on. For more verbose, low-level system logging which may be useful for troubleshooting, you can stream logs from the logspout service:

$ quantrocket flightlog stream --detail
quantrocket_houston_1|172.18.0.22 - - [29/May/2018:20:21:45 +0000] GET /launchpad/gateways?status=running HTTP/1.1 200 3 - python-requests/2.14.2
quantrocket_blotter_1|[spooler /var/tmp/uwsgi/spool pid: 12] managing request uwsgi_spoolfile_on_2f55d7838d0f_5_2_1972169246_1526920303_414476 ...
quantrocket_account_1|waiting until ECB's next expected 4PM CET update to collect exchange rates
quantrocket_houston_1|172.18.0.18 - - [29/May/2018:20:21:52 +0000] GET /ibg5/gateway HTTP/1.1 200 22 - python-requests/2.14.2

The introductory tutorial describes a useful technique of docking terminals in JupyterLab for the purpose of log monitoring.

Filtering logs

New in quantrocket/jupyter:1.2.3

The detailed logs can be noisy, and sometimes you may want to filter out some of the noise. You can use standard Unix grep for this purpose. For example:

$ # show only the log output of Moonshot
$ quantrocket flightlog stream --detail | grep 'moonshot'

Or use grep -v to exclude log output:

$ # ignore blotter output
$ quantrocket flightlog stream --detail | grep -v 'blotter'

Download log files

In addition to streaming your logs, you can also download log files, which contain up to 7 days of log history. You can download the application logs:

$ quantrocket flightlog get /path/to/localdir/app.log

Or you can download the more verbose system logs:

$ quantrocket flightlog get --detail /path/to/localdir/system.log

Papertrail integration

Papertrail is a log management service that lets you monitor logs from a web interface, flexibly search the logs, and send alerts to other services (email, Slack, PagerDuty, webhooks, etc.) based on log message criteria. You can configure flightlog to send your logs to your Papertrail account.

To get started, sign up for a Papertrail account (free plan available).

In Papertrail, locate your Papertrail host and port number (Settings > Log Destinations).

Use the QuantRocket configuration wizard to enter your Papertrail configuration:

[Screenshot: Papertrail configuration wizard]

Copy the block of YAML for the flightlog service from the configuration wizard and paste it into your Docker Compose or Stack file, replacing the existing flightlog YAML block. An example YAML block is shown below:

flightlog:
  image: 'quantrocket/flightlog:1.1.0'
  volumes:
    - /var/log/flightlog
  environment:
    PAPERTRAIL_HOST: logsX.papertrailapp.com
    PAPERTRAIL_PORT: 'XXXXX'
    PAPERTRAIL_LOGLEVEL: DEBUG

Redeploy the flightlog service. For local deployments:

$ cd /path/to/docker-compose.yml
$ docker-compose -p quantrocket up -d flightlog

You can log a message from the CLI to test your Flightlog configuration:

$ quantrocket flightlog log "this is a test" --name myapp --level INFO

Your message should show up in Papertrail:

[Screenshot: Papertrail log message]

You can set up alerts in Papertrail based on specific log criteria. For example, you could configure Papertrail to email you whenever new ERROR-level log messages arrive.

Send log messages

You can use the Python client to log to Flightlog from your own code:

import logging
from quantrocket.flightlog import FlightlogHandler

logger = logging.getLogger('myapp')
logger.setLevel(logging.DEBUG)
handler = FlightlogHandler()
logger.addHandler(handler)

logger.info('my app just opened a position')

You can also log directly from the CLI (this is a good way to test your Flightlog configuration):

$ quantrocket flightlog log "this is a test" --name myapp --level INFO

If you're streaming your logs, you should see your message show up:

Output
2018-02-21 10:59:01 myapp: INFO this is a test

Log command output

The CLI can accept a log message over stdin, which is useful for piping in the output of another command. In the example below, we check our balance with the --below option to only show account balance info if the cushion has dropped too low. If the cushion is safe, the first command produces no output and nothing is logged. If the cushion is too low, the output is logged to flightlog at a CRITICAL level:

$ quantrocket account balance --latest --below 'Cushion:0.02' --fields 'NetLiquidation' 'Cushion' | quantrocket flightlog log --name 'quantrocket.account' --level 'CRITICAL'

If you've set up Twilio alerts for CRITICAL messages, you can add this command to the crontab on one of your countdown services, and you'll get a text message whenever there's trouble.
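For example, a countdown service crontab entry that runs this check every 30 minutes during US trading hours might look like the following (a sketch only; adjust the schedule for your own timezone and countdown service):

# check the margin cushion every 30 minutes during US trading hours, Mon-Fri
*/30 9-16 * * mon-fri quantrocket account balance --latest --below 'Cushion:0.02' --fields 'NetLiquidation' 'Cushion' | quantrocket flightlog log --name 'quantrocket.account' --level 'CRITICAL'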

Database Management

Amazon S3 backup and restore

QuantRocket can backup your databases to Amazon S3 (Amazon account required). Provide your AWS credentials to the db service as environment variables (see the configuration wizard for guidance).

S3 backup

You can backup all databases for all services using the "all" keyword:

$ quantrocket db s3push 'all'
status: the databases will be pushed to S3 asynchronously

Or all databases for a particular service:

$ quantrocket db s3push 'history'
status: the databases will be pushed to S3 asynchronously

Or particular databases for a particular service. For example, this command would push a database called quantrocket.history.nyse.sqlite:

$ quantrocket db s3push 'history' 'nyse'
status: the databases will be pushed to S3 asynchronously

If the same database already exists in S3, it will be overwritten by the new version of the database. If you wish to keep multiple versions, you can enable versioning on your S3 bucket.

You can use your crontab to automate the backup process. It's also good to optimize your databases periodically, preferably when nothing else is using them. Optimizing "vacuums" your database files, which defragments them and frees unused disk space. For example:

# Optimize and backup databases on the weekend
0 1 * * sat quantrocket db optimize 'all'
0 7 * * sat quantrocket db s3push 'all'

S3 restore

You can restore backups from S3 to your QuantRocket deployment. You can restore all databases for all services:

$ quantrocket db s3pull 'all'
status: the databases will be pulled from S3 asynchronously

All databases for a particular service:

$ quantrocket db s3pull 'history'
status: the databases will be pulled from S3 asynchronously

Or particular databases for a particular service. For example, this command would pull databases called quantrocket.history.usa-stk-1d.sqlite and quantrocket.history.japan-stk-1d.sqlite:

$ quantrocket db s3pull 'history' 'usa-stk-1d' 'japan-stk-1d'
status: the databases will be pulled from S3 asynchronously

Selective database restore can be used to facilitate multi-user deployments.

Automatic S3 restore on service launch

If you provide AWS credentials and an S3 bucket in your Docker Compose file, any QuantRocket databases existing in the S3 bucket will automatically be pulled when the db service launches (the equivalent of running quantrocket db s3pull all). This allows you to redeploy QuantRocket with databases backed up from an earlier deployment.

In some cases you might not want this behavior; for example, you might want to connect your deployment to an S3 bucket but only selectively restore databases to the new deployment. If so, you can disable automatic restore by setting the environment variable S3PULL_ON_LAUNCH to false in your Compose file:

db:
    image: quantrocket/db:latest
    environment:
        AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID}
        AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY}
        S3_BUCKET: mybucket
        S3PULL_ON_LAUNCH: 'false' # defaults to true

Local backup and restore

Databases are stored inside a Docker volume, a special Docker-managed area of the filesystem. On Windows and MacOS, Docker runs inside a virtual machine, so Docker volumes are located on the filesystem of the virtual machine, not the host filesystem. On Linux, volumes are located on the host filesystem.

If the Docker application on Windows or MacOS becomes unable to start for any reason, it can be cumbersome to recover the data from the virtual machine disk image. Therefore it's a good idea to backup the data periodically to a more accessible filesystem.

The easiest way to export all of your databases (for example to an external drive) is to use Docker's cp command to copy the entire database directory to your host machine:

$ # syntax is: docker cp container:/path/to/copy/from /host/path/to/copy/to
$ docker cp quantrocket_db_1:/var/lib/quantrocket path/to/storage/exported_quantrocket_dbs

This also works for backing up your code:

$ docker cp quantrocket_codeload_1:/codeload path/to/storage/exported_quantrocket_codeload

To later restore your data into a new deployment, again use docker cp:

$ # syntax is: docker cp /host/path/to/copy/from/. container:/path/to/copy/to/
$ docker cp path/to/storage/exported_quantrocket_dbs/. quantrocket_db_1:/var/lib/quantrocket/
$ docker cp path/to/storage/exported_quantrocket_codeload/. quantrocket_codeload_1:/codeload/
Carefully note the syntax of the restore commands to avoid unexpected results such as inserting an extra subdirectory in the destination path. There is a dot (.) at the end of the source directory path (path/to/storage/exported_quantrocket_dbs/.), indicating that the directory contents should be copied but not the directory itself. There is a slash at the end of the destination path (quantrocket_db_1:/var/lib/quantrocket/), indicating that the files should be placed directly under that directory.

Working directly with databases

QuantRocket uses SQLite as its database backend. SQLite is fast, simple, and reliable. SQLite databases are ordinary disk files, making them easy to copy, move, and work with. If you want to run SQL queries directly against your databases, you can use the sqlite3 command line tool, either within the Docker container or on a separate downloaded copy of the database.

The safest way to run SQL queries against your databases is to first copy the database. You can list the available databases, then download the one you care about:

$ # list databases
$ quantrocket db list
quantrocket.blotter.orders.sqlite
quantrocket.history.nyse-lrg.sqlite
quantrocket.history.nyse-mid.sqlite
quantrocket.history.nyse-sml.sqlite
quantrocket.master.main.sqlite
$ # download a copy of one
$ quantrocket db get quantrocket.history.nyse-lrg.sqlite /tmp/quantrocket.history.nyse-lrg.sqlite

Now you can safely use sqlite3 to explore and analyze your copy of the database.

$ sqlite3 /tmp/quantrocket.history.nyse-lrg.sqlite
sqlite>
Databases use the following naming convention: quantrocket.{service}.{code}.sqlite. The {code} portion of the database name is unique for the service but not necessarily unique across all services, and is used as shorthand for specifying the database in certain parts of QuantRocket (for example, historical data collection is triggered by specifying the database code).

Alternatively, to run queries inside the Docker container, use docker exec to open a shell inside the container. You can then list the databases, which are located in /var/lib/quantrocket, and open a sqlite3 shell into the database you're interested in:

$ docker exec -ti quantrocket_db_1 bash
root@71d2acb4d10c:/$ ls /var/lib/quantrocket
quantrocket.blotter.orders.sqlite
quantrocket.history.nyse-lrg.sqlite
quantrocket.history.nyse-mid.sqlite
quantrocket.history.nyse-sml.sqlite
quantrocket.master.main.sqlite
root@71d2acb4d10c:/$ sqlite3 /var/lib/quantrocket/quantrocket.history.nyse-lrg.sqlite
sqlite>
SQLite supports unlimited concurrent reads but only one writer at a time. Be careful if you choose to work directly with your QuantRocket databases, as you could break QuantRocket by doing so. You should limit yourself to SELECT queries. A safer approach is to copy the database as described above.
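If you prefer to explore the downloaded copy from Python rather than the sqlite3 shell, the standard library's sqlite3 module together with pandas works well. Below is a minimal sketch; the commented-out table name is hypothetical, so query sqlite_master first to see which tables your database actually contains:

import sqlite3
import pandas as pd

# connect to the downloaded copy of the database
conn = sqlite3.connect("/tmp/quantrocket.history.nyse-lrg.sqlite")

# list the tables present in the database
tables = pd.read_sql("SELECT name FROM sqlite_master WHERE type='table'", conn)
print(tables)

# then run SELECT queries against the table you're interested in, e.g.:
# prices = pd.read_sql("SELECT * FROM SomeTable LIMIT 10", conn)  # hypothetical table name

conn.close()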

Disk space

You can use Docker to check on the total disk utilization of your deployment:

$ docker system df
TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              16                  16                  10.1GB              5.474GB (54%)
Containers          20                  19                  1.272GB             26.83MB (2%)
Local Volumes       5                   4                   102.5GB             0B (0%)
Build Cache         0                   0                   0B                  0B

Your databases are reflected under the heading "Local Volumes". For a more granular look, you can list your databases with details, which includes the file size:

$ quantrocket db list --detail
- last_modified: '2018-09-05T15:49:02'
  name: quantrocket.account.balance.sqlite
  path: /var/lib/quantrocket/quantrocket.account.balance.sqlite
  size_in_mb: 0.33
- last_modified: '2018-09-05T15:01:05'
  name: quantrocket.account.rates.sqlite
  path: /var/lib/quantrocket/quantrocket.account.rates.sqlite
  size_in_mb: 0.29
- last_modified: '2018-09-05T13:00:38'
  name: quantrocket.blotter.errors.sqlite
  path: /var/lib/quantrocket/quantrocket.blotter.errors.sqlite
  size_in_mb: 0.45
- last_modified: '2018-09-05T15:49:53'
  name: quantrocket.blotter.executions.sqlite
  path: /var/lib/quantrocket/quantrocket.blotter.executions.sqlite
  size_in_mb: 7.25
 ...
>>> from quantrocket.db import list_databases
>>> databases = list_databases(detail=True)
>>> databases = pd.DataFrame.from_records(databases)
>>> databases.head()
         last_modified                                   name                                               path  size_in_mb
0  2018-09-05T15:52:02     quantrocket.account.balance.sqlite  /var/lib/quantrocket/quantrocket.account.balan...        0.33
1  2018-09-05T15:01:05       quantrocket.account.rates.sqlite  /var/lib/quantrocket/quantrocket.account.rates...        0.29
2  2018-09-05T13:00:38      quantrocket.blotter.errors.sqlite  /var/lib/quantrocket/quantrocket.blotter.error...        0.45
3  2018-09-05T15:52:40  quantrocket.blotter.executions.sqlite  /var/lib/quantrocket/quantrocket.blotter.execu...        7.25
4  2018-09-05T13:00:35      quantrocket.blotter.orders.sqlite  /var/lib/quantrocket/quantrocket.blotter.order...        1.62
$ curl -X GET 'http://houston/db/databases?detail=True'
[{"name": "quantrocket.account.balance.sqlite", "path": "/var/lib/quantrocket/quantrocket.account.balance.sqlite", "size_in_mb": 0.33, "last_modified": "2018-09-05T15:51:02"}, {"name": "quantrocket.account.rates.sqlite", "path": "/var/lib/quantrocket/quantrocket.account.rates.sqlite", "size_in_mb": 0.29, "last_modified": "2018-09-05T15:01:05"}, {"name": "quantrocket.blotter.errors.sqlite", "path": "/var/lib/quantrocket/quantrocket.blotter.errors.sqlite", "size_in_mb": 0.45, "last_modified": "2018-09-05T13:00:38"}, {"name": "quantrocket.blotter.executions.sqlite", "path": "/var/lib/quantrocket/quantrocket.blotter.executions.sqlite", "size_in_mb": 7.25, "last_modified": "2018-09-05T15:51:37"},
...
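Since each record includes size_in_mb, you can total the sizes from the Python example above to see how much of the "Local Volumes" figure is attributable to QuantRocket databases:

>>> # total size of all QuantRocket databases, in MB
>>> databases["size_in_mb"].sum()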

Add disk space

The steps for adding more disk space depend on the host operating system.

  • On Windows and Mac, Docker runs inside a VM which is allocated a certain amount of disk space. Open the Docker settings via the system tray and find the section for increasing the allocated disk space. If you add an external hard drive you can move the VM image to that drive.
  • On Linux, Docker uses the native filesystem so there are no additional steps beyond increasing the disk space on the host OS.

Code Management

Push to Git

When starting out with QuantRocket, you can load the demo files from QuantRocket's demo repository. After editing the files or creating your own, you might like to push them to your own Git repo.

To do this, first create an empty repository in your Git hosting provider (for example, GitHub or Bitbucket).

Next, open a terminal inside JupyterLab. Inside the /codeload directory, initialize a Git repository:

$ git init

Configure the email address and name to use with commits:

$ git config --global user.email "neil@example.com"
$ git config --global user.name "Neil Armstrong"

Configure the URL of your remote repository. In this example, we'll use HTTPS instead of SSH and add our GitHub or Bitbucket username to the URL so that we can use username/password authentication instead of private key authentication:

$ # GitHub example:
$ git remote add origin https://<your-username>@github.com/<your-username>/<your-repo-name>.git
$ # Bitbucket example:
$ git remote add origin https://<your-username>@bitbucket.org/<your-username>/<your-repo-name>.git

Add and commit your files:

$ git add moonshot/*.py
$ git add notebooks/*.ipynb
$ git commit -m 'adding moonshot algos and notebooks'

Finally, push your files to your Git hosting provider:

$ git push -u origin master

Deploy from Git

To deploy from your own Git repository into a brand new QuantRocket deployment, unselect the configuration wizard option to load QuantRocket's demo files. Once your deployment is running, open a terminal inside JupyterLab and clone your repository:

$ # GitHub example:
$ git clone https://<your-username>@github.com/<your-username>/<your-repo-name>.git /codeload
$ # Bitbucket example:
$ git clone https://<your-username>@bitbucket.org/<your-username>/<your-repo-name>.git /codeload

Troubleshooting

Running out of memory

Quantitative research and backtesting often require a lot of memory. Exactly how much memory depends on a variety of factors, including the number of securities in your research and backtests, the depth and granularity of your data, the number of data fields, and your analytical techniques. While QuantRocket's historical and fundamental data services are designed to serve data efficiently without loading all of it into memory, your research and backtests will typically load large amounts of data into memory. Sooner or later, you may run out of memory.

QuantRocket makes no attempt to prevent you from loading more data than your system can handle; it tries to do whatever you ask. Luckily, because QuantRocket runs inside Docker, running out of memory won't crash your whole deployment or the host OS; Docker will in most cases simply kill the process that tried to use too much memory. By the nature of out of memory errors, QuantRocket doesn't have a chance to provide an optimal error message, so it's worth knowing what to look for.

If you run out of memory in a Jupyter notebook, Docker will kill the kernel process and you'll probably see a message like this:

[Screenshot: Jupyter notebook killed-process message]

If you run out of memory in a backtest, you'll get a 502 error referring you to flightlog, which will instruct you to add more memory or try a segmented backtest:

$ quantrocket moonshot backtest 'big-boy' --start-date '2000-01-01'
msg: 'HTTPError(''502 Server Error: Bad Gateway for url: http://houston/moonshot/backtests?strategies=big-boy&start_date=2000-01-01'',
  ''please check the logs for more details'')'
status: error
$ quantrocket flightlog stream --hist 1
2017-10-02 19:29:32 quantrocket.moonshot: ERROR the system killed the worker handling the request, likely an Out Of Memory error; if you were backtesting, try a segmented backtest to reduce memory usage (for example `segment="A"`), or add more memory
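As the log message suggests, a segmented backtest runs the date range in smaller chunks (for example, one year at a time) so that less data is held in memory at once. Below is a minimal, hedged sketch using the Python client; it assumes quantrocket.moonshot.backtest accepts the segment argument referenced in the log message (check the API reference for your version's exact signature):

from quantrocket.moonshot import backtest

# run the backtest one calendar year at a time ("A" = annual segments)
# to limit how much data is held in memory at once
backtest(["big-boy"], start_date="2000-01-01", segment="A",
         filepath_or_buffer="big_boy_results.csv")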

You can use Docker to check how much memory is available and how much is being used by different containers:

$ docker stats
NAME                                CPU %               MEM USAGE / LIMIT
quantrocket_moonshot_1              0.01%               58.25MiB / 7.952GiB
quantrocket_jupyter_1               0.01%               64.86MiB / 7.952GiB
quantrocket_zipline_1               0.01%               65.26MiB / 7.952GiB
quantrocket_master_1                0.02%               31.85MiB / 7.952GiB
...

Advanced Topics

Custom Docker services

If you run your own custom Docker services inside the same Docker network as QuantRocket, and those services provide an HTTP API, you can access them through houston. Assuming a custom Docker service named secretsauce listening on port 80 inside the Docker network and providing an API endpoint /secretstrategy/signals, you can access your service at:

$ curl -X GET 'http://houston/proxy/http/secretsauce/80/secretstrategy/signals'

Houston can also proxy services speaking the uWSGI protocol:

$ curl -X GET 'http://houston/proxy/uwsgi/secretsauce/80/secretstrategy/signals'

The benefit of using houston as a proxy, particularly if running QuantRocket in the cloud, is that you don't need to expose your custom service to a public port; your service is only accessible from within your trusted Docker network, and all requests from outside the network must go through houston, which you can secure with SSL and Basic Auth. The following table depicts an example configuration:

Service        Port exposed to other services in the Docker network    Port mapped on the host OS    Directly reachable from outside?
houston        443 and 80                                              443 (80 not mapped)           yes
secretsauce    80                                                      not mapped                    no

So you would connect to houston securely on port 443 and houston would connect to secretsauce on port 80, but you would not connect directly to the secretsauce service. Your service would use EXPOSE 80 in its Dockerfile but you would not use the -p/--publish option when starting the container with docker run (or the ports key in Docker Compose).
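For example, the secretsauce service's entry in your Compose file might look like the following sketch (the image name is hypothetical; the key point is that the port is exposed to the Docker network but there is no ports mapping to the host):

secretsauce:
    image: 'myorg/secretsauce:latest'   # hypothetical image
    expose:
        - '80'   # reachable by houston and other services on the Docker network
    # note: no 'ports' key, so the service is not published on the host OS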

HTTP request concurrency

The number of workers available to handle HTTP requests in a QuantRocket service is set via environment variable and can be overridden. If you have a very active deployment, you might find it beneficial to increase the number of workers (at the cost of greater resource consumption). First, check the current number of workers:

$ docker exec quantrocket_master_1 env | grep UWSGI_WORKERS
UWSGI_WORKERS=3

Override the variable by setting the desired value in your Compose file or Stack file:

# docker-compose.yml
master:
    image: 'quantrocket/master:latest'
    environment:
        UWSGI_WORKERS: 5

Then redeploy the service:

$ docker-compose -f docker-compose.yml -p quantrocket up -d master

CLI output format

By default, the command line interface (CLI) will display command results in YAML format:

$ quantrocket launchpad status
ibg1: stopped
ibg2: running
ibg3: stopped

If you prefer the output format to be JSON, set an environment variable called QUANTROCKET_CLI_OUTPUT_FORMAT:

$ export QUANTROCKET_CLI_OUTPUT_FORMAT=json
$ quantrocket launchpad status
{"ibg1": "stopped", "ibg2": "running", "ibg3": "stopped"}