Hello community members,
Is it possible to backtest a portfolio using a list of buy and sell transactions (date, symbol)?
If so, please point me to any relevant posts or resources.
Thank you!
Alex
Yes, external buys and sells can be imported into a backtest (and research too) by using 'Self Serve' data. I'll walk through an example.
Start with a CSV file of transactions, something like this:
Transaction_Date,Date_Copy,Stock,Qty
2020-01-03 00:00:00,2020-01-03,AAPL,100
2020-01-03 00:00:00,2020-01-03,IBM,50
2020-01-07 00:00:00,2020-01-07,AAPL,100
2020-01-09 00:00:00,2020-01-09,AAPL,-200
2020-01-13 00:00:00,2020-01-13,IBM,-50
That is a list of transactions with a stock ticker and qty bought or sold (negative qty values indicate a sale or a short). Load that as a Self Serve data set. There are a couple of good posts on how that works here and here. The documentation also has a section devoted to Self Serve data.
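As a quick sanity check before uploading, it can help to confirm that the signed quantities net out the way you expect. The following is only a sketch in plain Python (it does not use any Quantopian API), and the helper name running_positions is made up for illustration:

```python
from collections import defaultdict

# Sample transactions from the CSV above: (date, ticker, signed quantity).
transactions = [
    ("2020-01-03", "AAPL", 100),
    ("2020-01-03", "IBM", 50),
    ("2020-01-07", "AAPL", 100),
    ("2020-01-09", "AAPL", -200),
    ("2020-01-13", "IBM", -50),
]


def running_positions(rows):
    """Return a list of (date, ticker, position held after that trade)."""
    held = defaultdict(int)
    history = []
    for date, ticker, qty in rows:
        held[ticker] += qty  # a negative qty reduces (or shorts) the position
        history.append((date, ticker, held[ticker]))
    return history


history = running_positions(transactions)
for date, ticker, position in history:
    print(date, ticker, position)
```

With the sample data both positions end at zero, so every buy is matched by a later sale.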
Normally, data loaded via Self Serve is assumed to be 'as-of' yesterday so it can then be acted on today. The data will be available one trading day after the date in the CSV file. This is what's typically desired for a trading signal or market data to ensure there is no lookahead bias. However, in this case we want the dates to represent the actual trade date with no lag. There is an option on the Self Serve upload screen labeled 'Use explicit timestamp column' to allow just this behavior. Click that and select the 'Transaction_Date' field in the CSV. That will make the data available on the date specified in the file without the one day lag.
However, there are a couple of things to note. First, this date must actually be a timestamp, so in the CSV file add the hh:mm:ss portion (as above). Second, include another field for the as-of date. This isn't really used, but it needs to be present for the import. Just copy the Transaction_Date without the time portion.
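If the trades start out as plain (date, ticker, qty) records, the two date columns can be generated rather than typed by hand. This is a minimal sketch using only the Python standard library. The helper names to_self_serve_rows and to_csv_text are hypothetical, and the column names simply follow the sample file above:

```python
import csv
import io
from datetime import datetime

FIELDS = ["Transaction_Date", "Date_Copy", "Stock", "Qty"]


def to_self_serve_rows(trades):
    """Expand (date, ticker, qty) tuples into rows with both date columns."""
    rows = []
    for trade_date, ticker, qty in trades:
        day = datetime.strptime(trade_date, "%Y-%m-%d")
        rows.append({
            # Timestamp column: must include the hh:mm:ss portion.
            "Transaction_Date": day.strftime("%Y-%m-%d %H:%M:%S"),
            # As-of date column: the same date without the time.
            "Date_Copy": day.strftime("%Y-%m-%d"),
            "Stock": ticker,
            "Qty": qty,
        })
    return rows


def to_csv_text(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(to_self_serve_rows([
    ("2020-01-03", "AAPL", 100),
    ("2020-01-13", "IBM", -50),
]))
print(csv_text)
```

The resulting text can be saved to a file and uploaded through the Self Serve screen.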
That's pretty much it. With the buy and sell quantities imported as a Self Serve dataset, they can be fetched via pipeline just like any other data. Use the values to place orders, something like this:
    # Fetch the pipeline data.
    pipe_data = pipeline_output('my_pipe')
    # 'order_qty' is the column name used in the pipeline definition below.
    for stock, qty in pipe_data.order_qty.items():
        # Only place orders for securities that can currently be traded.
        if data.can_trade(stock):
            order(stock, qty)
Once again, there are a couple of things to note. All Self Serve data is forward filled. This means that, for example, the last transaction for -50 shares of IBM will not only show up on 2020-01-13 but will also be forward filled for every day afterward until it changes. To keep from re-ordering those shares each day, one needs to check whether the order qty is current. This can be done in the pipeline definition, something like this:
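    from quantopian.pipeline import Pipeline
    from quantopian.pipeline.factors import BusinessDaysSincePreviousEvent
    # my_transactions is the Self Serve dataset; import it from your own
    # user namespace as shown on the Self Serve page.

    def make_pipeline():
        # Get the latest quantity of shares to order and how many days ago it was.
        order_qty = my_transactions.qty.latest
        days_since_update = BusinessDaysSincePreviousEvent(
            inputs=[my_transactions.timestamp.latest]
        )
        # Keep only rows whose transaction was recorded today.
        is_current = days_since_update < 1
        pipe = Pipeline(
            columns={'order_qty': order_qty},
            screen=is_current,
        )
        return pipe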
The screen in the above pipeline definition will keep all the forward filled transactions from being returned when running the pipeline.
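To see why the screen matters, here is a small stand-alone simulation of forward filling, written in plain Python rather than with the pipeline API. The function forward_fill and the three-day calendar are illustrative assumptions. Positions in the list stand in for business days:

```python
# Three consecutive trading days and the single IBM update from the sample file.
trading_days = ["2020-01-13", "2020-01-14", "2020-01-15"]
ibm_updates = {"2020-01-13": -50}


def forward_fill(days, updates):
    """Return (day, qty, days_since_update), carrying the last value forward."""
    filled = []
    last_qty = None
    age = None
    for day in days:
        if day in updates:
            last_qty, age = updates[day], 0  # fresh value on its own date
        elif last_qty is not None:
            age += 1  # stale value repeated from an earlier day
        if last_qty is not None:
            filled.append((day, last_qty, age))
    return filled


filled = forward_fill(trading_days, ibm_updates)
# Equivalent of the screen: keep only rows updated less than one day ago.
current = [(day, qty) for day, qty, age in filled if age < 1]

print(filled)
print(current)
```

Without the filter the -50 share order would appear on all three days. With it, only the 2020-01-13 row remains.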
See the attached algo. It won't run 'as-is' because Self Serve datasets are only accessible by the owner (which is me). To run the algo you will need to create your own Self Serve dataset. Copy and paste the one above if desired.
If one has a buy/sell signal (say a 1 or 0) instead of the actual order shares, the algo would be very similar. Instead of ordering a fixed quantity of shares, perhaps use a method such as order_target_value or order_target_percent and base the target on the imported value.
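One way to turn 1/0 signals into targets is sketched below. Equal weighting across the stocks with a signal of 1 is purely an assumption made for the example, and target_weights is a hypothetical helper, not part of any Quantopian API:

```python
def target_weights(signals):
    """Map {ticker: 1 or 0} to {ticker: portfolio weight}.

    Assumption: stocks with a signal of 1 share the portfolio equally,
    and stocks with a signal of 0 get a target weight of zero.
    """
    longs = [ticker for ticker, flag in signals.items() if flag == 1]
    weight = 1.0 / len(longs) if longs else 0.0
    return {
        ticker: (weight if flag == 1 else 0.0)
        for ticker, flag in signals.items()
    }


one_long = target_weights({"AAPL": 1, "IBM": 0})
both_long = target_weights({"AAPL": 1, "IBM": 1})
print(one_long)
print(both_long)

# Inside an algorithm, each weight could then be passed to something like
# order_target_percent(stock, weight) in place of order(stock, qty).
```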
Hope that helps and is a start.
The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.
@ Dan - Thanks for posting your solution using self-serve data.
I have been using another approach to do the same thing but without self-serve data. I like your solution because it requires less re-formatting of the trade data and I don't have to copy and paste the data into the algorithm code.
One thing however is that the solution does not include transaction prices. Alex did not ask for this, but it's something that I think generally people may be interested in.
To that end I have created a modification of your algorithm that includes price.
Transaction_Date,Date_Copy,Stock,Qty,Price
2020-01-03 00:00:00,2020-01-03,AAPL,100,296
2020-01-03 00:00:00,2020-01-03,IBM,50,134
2020-01-07 00:00:00,2020-01-07,AAPL,100,300
2020-01-09 00:00:00,2020-01-09,AAPL,-200,307.8
2020-01-13 00:00:00,2020-01-13,IBM,-50,135.5
The way that I include price is to use a custom slippage model with an associated limit order. The limit price for the order is scaled to ensure that the order triggers, and then the price is scaled back to the original value in the custom slippage model. It gets the job done, but it seems like a bit of a kludge. Could you review this approach and let me know if there is a better way?
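For readers trying to picture the scaling trick, here is one possible interpretation in plain Python. It is not the code from my algorithm, and the SCALE factor and both function names are assumptions made for illustration. The idea is that a buy limit far above the market (or a sell limit far below it) always triggers, and the slippage step then undoes the scaling to recover the recorded price:

```python
SCALE = 10.0  # hypothetical factor, large enough that the limit always triggers


def scaled_limit_price(recorded_price, qty):
    """Limit price to submit with the order.

    A limit buy triggers when the market is at or below the limit, so scale
    the price up; a limit sell triggers at or above the limit, so scale down.
    """
    if qty > 0:
        return recorded_price * SCALE
    return recorded_price / SCALE


def fill_price_from_limit(limit_price, qty):
    """Undo the scaling, as the custom slippage model would at fill time."""
    if qty > 0:
        return limit_price / SCALE
    return limit_price * SCALE


# Round trip for one buy and one sell from the sample file.
buy_limit = scaled_limit_price(296, 100)
sell_limit = scaled_limit_price(307.8, -200)
buy_fill = fill_price_from_limit(buy_limit, 100)
sell_fill = fill_price_from_limit(sell_limit, -200)
print(buy_limit, buy_fill)
print(sell_limit, sell_fill)
```

The recovered fill prices match the recorded prices up to floating point rounding.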