Conditional to new pandas dataframe

Hi,

I am having trouble getting the code below to work. Basically, I have another dataframe (results_df) whose rows are indexed by numbers (0, 1, ..., 971) and whose columns are y, x, pvalue, tstat, etc.

For each row, I want to test whether pvalue < 0.05 and, if so, add the corresponding x to a new dataframe indexed by numbers (0, 1, 2, etc). Can anybody help? Much appreciated, thanks!

i=0  
coint_tickers = pd.DataFrame()

for x in range(len(results_df)):  
    if results_df['pvalue'][i] < 0.05:  
        coint_tickers[i] = results_df['x'][i]  
        #print results_df['x'][i]  
    i=i+1  
3 responses

Your loop assigns each number in the range to x, but then you look up results_df['pvalue'][i], which uses i as the index. I would use a single variable in place of x and i so there are fewer moving parts. If you post results_df.head(), it would be easier to give you an exact answer.

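matches = []  # assigning a scalar to a column of an empty DataFrame adds no rows, so collect the values first

for i in range(len(results_df)):
    if results_df['pvalue'][i] < 0.05:
        matches.append(results_df['x'][i])

coint_tickers = pd.DataFrame({'x': matches})  # new index is 0, 1, 2, ...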

Or try indexing the dataframe by row:

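matches = []  # collect matching x values, then build the dataframe once

for i in range(len(results_df)):
    row = results_df.loc[i]  # .ix was removed in pandas 1.0; .loc looks up by index label
    if row['pvalue'] < 0.05:
        matches.append(row['x'])

coint_tickers = pd.DataFrame({'x': matches})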

pandas, like NumPy, supports boolean indexing, so this can be done without a loop:

coint_tickers = results_df[results_df['pvalue'] < 0.05]['x']  
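To make the one-liner concrete, here is a minimal sketch on made-up data. The tickers and p-values are invented for illustration and only stand in for the real results_df. The reset_index(drop=True) step is an addition, not part of the line above: it renumbers the result 0, 1, 2, ... as the question asked, since boolean indexing on its own keeps the original row labels.

```python
import pandas as pd

# Hypothetical stand-in for results_df; all values are made up.
results_df = pd.DataFrame({
    'y': ['AAA', 'AAA', 'BBB', 'CCC'],
    'x': ['BBB', 'CCC', 'DDD', 'DDD'],
    'pvalue': [0.01, 0.20, 0.03, 0.50],
})

# Boolean mask: True for rows whose p-value is below the 5% threshold.
mask = results_df['pvalue'] < 0.05

# Select column 'x' for the matching rows, then renumber them from 0.
coint_tickers = results_df.loc[mask, 'x'].reset_index(drop=True)

print(coint_tickers.tolist())  # ['BBB', 'DDD']
```

The result is a Series. If a one-column dataframe is needed, coint_tickers.to_frame() converts it.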

Thank you for your replies.

Aidan, yours worked perfectly. Thank you!