Universe data frames normalization #8385

Merged
merged 29 commits into from
Nov 26, 2024
29 commits
ce720c4
Normalize universe data frames
jhonabreul Oct 21, 2024
5017fd7
Fix unit tests and algorithms to expect new universe dataframe format
jhonabreul Oct 21, 2024
5630896
Fixes
jhonabreul Oct 21, 2024
9150a2f
Add PandasConverter.DataFrameGenerator class
jhonabreul Oct 22, 2024
e6973c5
Pandas data frame generator class fixes
jhonabreul Oct 23, 2024
71a8f27
Add comments
jhonabreul Oct 23, 2024
28395a4
Housekeeping
jhonabreul Oct 23, 2024
fc1b1dc
Add attributes to mark classes and properties for pandas processing
jhonabreul Oct 24, 2024
5d2da6a
Improve pandas properties expanding
jhonabreul Oct 24, 2024
ea01cb8
Use PandasData generalization for Lean common data types
jhonabreul Oct 25, 2024
39c93d5
Add points time as column when converting base data collections to da…
jhonabreul Oct 27, 2024
b60a284
Cleanup and minor changes
jhonabreul Oct 28, 2024
3271bc5
Minor change
jhonabreul Oct 28, 2024
beed4df
Pandas data to get type members on demand
jhonabreul Oct 28, 2024
31139df
Move Pandas helper classes to their own files
jhonabreul Oct 28, 2024
051beac
Minor changes
jhonabreul Oct 29, 2024
9c4caeb
Add flatten argument to python history api
jhonabreul Oct 31, 2024
c3e24d8
Adding missing changes to last commit
jhonabreul Nov 1, 2024
9618a24
Update Pythonnet version to 2.0.40
jhonabreul Nov 1, 2024
fd2cad4
Add flatten argument to algorithm's OptionChain api
jhonabreul Nov 4, 2024
797b4da
Minor changes
jhonabreul Nov 4, 2024
878c785
Housekeeping
jhonabreul Nov 4, 2024
374c8dd
Minor changes
jhonabreul Nov 5, 2024
616151a
Bug fix skipping data collection data points
jhonabreul Nov 5, 2024
d86ecb1
Add comment
jhonabreul Nov 5, 2024
25bcba0
Set correct exchange time to OptionUniverse instances
jhonabreul Nov 6, 2024
1b2b9be
Address peer review and cleanup
jhonabreul Nov 18, 2024
b30da4d
Cleanup
jhonabreul Nov 19, 2024
7bb82b4
Minor changes
jhonabreul Nov 26, 2024
2 changes: 1 addition & 1 deletion Algorithm.CSharp/QuantConnect.Algorithm.CSharp.csproj
@@ -34,7 +34,7 @@
     <DebugType>portable</DebugType>
   </PropertyGroup>
   <ItemGroup>
-    <PackageReference Include="QuantConnect.pythonnet" Version="2.0.39" />
+    <PackageReference Include="QuantConnect.pythonnet" Version="2.0.40" />
     <PackageReference Include="Accord" Version="3.6.0" />
     <PackageReference Include="Accord.Fuzzy" Version="3.6.0" />
     <PackageReference Include="Accord.MachineLearning" Version="3.6.0" />
@@ -30,7 +30,7 @@
     <PackageLicenseFile>LICENSE</PackageLicenseFile>
   </PropertyGroup>
   <ItemGroup>
-    <PackageReference Include="QuantConnect.pythonnet" Version="2.0.39" />
+    <PackageReference Include="QuantConnect.pythonnet" Version="2.0.40" />
     <PackageReference Include="Accord" Version="3.6.0" />
     <PackageReference Include="Accord.Math" Version="3.6.0" />
     <PackageReference Include="Accord.Statistics" Version="3.6.0" />
@@ -125,12 +125,9 @@ def initialize(self):
         self.universe_settings.resolution = Resolution.HOUR
         universe = self.add_universe(self.universe.etf(spy, self.universe_settings, self.filter_etf_constituents))

-        historical_data = self.history(universe, 1)
-        if len(historical_data) != 1:
-            raise ValueError(f"Unexpected history count {len(historical_data)}! Expected 1")
-        for universe_data_collection in historical_data:
-            if len(universe_data_collection) < 200:
-                raise ValueError(f"Unexpected universe DataCollection count {len(universe_data_collection)}! Expected > 200")
+        historical_data = self.history(universe, 1, flatten=True)
+        if len(historical_data) < 200:
+            raise ValueError(f"Unexpected universe DataCollection count {len(historical_data)}! Expected > 200")

     ### <summary>
     ### Filters ETF constituents
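
The hunk above replaces a two-level length check (one history entry, then more than 200 items inside it) with a single check on the flattened result. A minimal sketch of that idea using only the standard library follows. The `nested_history` data and the `flatten` helper are hypothetical stand-ins for illustration and are not part of the Lean API:

```python
from datetime import datetime

# Hypothetical stand-in for universe history: one collection of
# constituents per timestamp (the pre-flatten shape).
nested_history = {
    datetime(2024, 10, 21): ["AAPL", "MSFT", "NVDA"],
}


def flatten(history):
    """Return one (time, symbol) row per constituent instead of one entry per timestamp."""
    return [(time, symbol) for time, symbols in history.items() for symbol in symbols]


flat_history = flatten(nested_history)

# Nested form: the length counts timestamps.
print(len(nested_history))
# Flattened form: the length counts constituents across all timestamps.
print(len(flat_history))
```

Under this reading, `len()` of the flattened result counts constituents rather than timestamps, which would explain why the new assertion compares `len(historical_data)` directly against 200.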
28 changes: 15 additions & 13 deletions Algorithm.Python/FundamentalRegressionAlgorithm.py
@@ -49,7 +49,7 @@ def initialize(self):
             raise ValueError(f"Unexpected Fundamental count {len(fundamentals)}! Expected 2")

         # Request historical fundamental data for symbols
-        history = self.history(Fundamental, TimeSpan(2, 0, 0, 0))
+        history = self.history(Fundamental, timedelta(days=2))
         if len(history) != 4:
             raise ValueError(f"Unexpected Fundamental history count {len(history)}! Expected 4")

@@ -69,26 +69,28 @@ def initialize(self):

     def assert_fundamental_universe_data(self):
         # Case A
-        universe_data_per_time = self.history(self._universe.data_type, [self._universe.symbol], TimeSpan(2, 0, 0, 0))
-        if len(universe_data_per_time) != 2:
-            raise ValueError(f"Unexpected Fundamentals history count {len(universe_data_per_time)}! Expected 2")
-
-        for universe_data_collection in universe_data_per_time:
-            self.assert_fundamental_enumerator(universe_data_collection, "A")
+        universe_data = self.history(self._universe.data_type, [self._universe.symbol], timedelta(days=2), flatten=True)
+        self.assert_fundamental_history(universe_data, "A")

         # Case B (sugar on A)
-        universe_data_per_time = self.history(self._universe, TimeSpan(2, 0, 0, 0))
-        if len(universe_data_per_time) != 2:
-            raise ValueError(f"Unexpected Fundamentals history count {len(universe_data_per_time)}! Expected 2")
-
-        for universe_data_collection in universe_data_per_time:
-            self.assert_fundamental_enumerator(universe_data_collection, "B")
+        universe_data_per_time = self.history(self._universe, timedelta(days=2), flatten=True)
+        self.assert_fundamental_history(universe_data_per_time, "B")

         # Case C: Passing through the universe type and symbol
         enumerable_of_data_dictionary = self.history[self._universe.data_type]([self._universe.symbol], 100)
         for selection_collection_for_a_day in enumerable_of_data_dictionary:
             self.assert_fundamental_enumerator(selection_collection_for_a_day[self._universe.symbol], "C")

+    def assert_fundamental_history(self, df, case_name):
+        dates = df.index.get_level_values('time').unique()
+        if dates.shape[0] != 2:
+            raise ValueError(f"Unexpected Fundamental universe dates count {dates.shape[0]}! Expected 2")
+
+        for date in dates:
+            sub_df = df.loc[date]
+            if sub_df.shape[0] < 7000:
+                raise ValueError(f"Unexpected historical Fundamentals data count {sub_df.shape[0]} case {case_name}! Expected > 7000")
+
     def assert_fundamental_enumerator(self, enumerable, case_name):
         data_point_count = 0
         for fundamental in enumerable:
2 changes: 1 addition & 1 deletion Algorithm.Python/OptionChainFullDataRegressionAlgorithm.py
@@ -28,7 +28,7 @@ def initialize(self):

         goog = self.add_equity("GOOG").symbol

-        option_chain = self.option_chain(goog)
+        option_chain = self.option_chain(goog, flatten=True)

         # Demonstration using data frame:
         df = option_chain.data_frame
@@ -29,7 +29,7 @@ def initialize(self):
         goog = self.add_equity("GOOG").symbol
         spx = self.add_index("SPX").symbol

-        chains = self.option_chains([goog, spx])
+        chains = self.option_chains([goog, spx], flatten=True)

         self._goog_option_contract = self.get_contract(chains, goog, timedelta(days=10))
         self._spx_option_contract = self.get_contract(chains, spx, timedelta(days=60))
29 changes: 15 additions & 14 deletions Algorithm.Python/OptionUniverseHistoryRegressionAlgorithm.py
@@ -25,22 +25,23 @@ def initialize(self):

         option = self.add_option("GOOG").symbol

-        historical_options_data_df = self.history(option, 3, Resolution.DAILY)
+        historical_options_data_df = self.history(option, 3, flatten=True)

-        if historical_options_data_df.shape[0] != 3:
-            raise RegressionTestException(f"Expected 3 option chains from history request, but got {historical_options_data_df.shape[0]}")
+        # Level 0 of the multi-index is the date, we expect 3 dates, 3 option chains
+        if historical_options_data_df.index.levshape[0] != 3:
+            raise RegressionTestException(f"Expected 3 option chains from history request, but got {historical_options_data_df.index.levshape[0]}")

-        for index, row in historical_options_data_df.iterrows():
-            data = row.data
-            date = index[4]
-            chain = list(self.option_chain_provider.get_option_contract_list(option, date))
+        for date in historical_options_data_df.index.levels[0]:
+            expected_chain = list(self.option_chain_provider.get_option_contract_list(option, date))
+            expected_chain_count = len(expected_chain)

-            if len(chain) == 0:
-                raise RegressionTestException(f"No options in chain on {date}")
+            actual_chain = historical_options_data_df.loc[date]
+            actual_chain_count = len(actual_chain)

-            if len(chain) != len(data):
-                raise RegressionTestException(f"Expected {len(chain)} options in chain on {date}, but got {len(data)}")
+            if expected_chain_count != actual_chain_count:
+                raise RegressionTestException(f"Expected {expected_chain_count} options in chain on {date}, but got {actual_chain_count}")

-            for i in range(len(chain)):
-                if data[i].symbol != chain[i]:
-                    raise RegressionTestException(f"Missing option contract {chain[i]} on {date}")
+            for i, symbol in enumerate(actual_chain.index):
+                expected_symbol = expected_chain[i]
+                if symbol != expected_symbol:
+                    raise RegressionTestException(f"Expected symbol {expected_symbol} at index {i} on {date}, but got {symbol}")
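
The rewritten test relies on three pandas `MultiIndex` features: `levshape` to count dates, `levels[0]` to iterate them, and `df.loc[date]` to pull one chain. A small sketch of those calls on made-up data follows. The contract names, the `close` column, and the `symbol` level name are assumptions for illustration; only the `time` level name appears in this PR:

```python
import pandas as pd

# Hypothetical flattened history: two dates with two contracts each.
# The symbols and prices are invented for this sketch.
index = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp("2024-10-21"), "GOOG_C_150"),
        (pd.Timestamp("2024-10-21"), "GOOG_P_150"),
        (pd.Timestamp("2024-10-22"), "GOOG_C_150"),
        (pd.Timestamp("2024-10-22"), "GOOG_P_150"),
    ],
    names=["time", "symbol"],
)
df = pd.DataFrame({"close": [1.0, 2.0, 1.5, 2.5]}, index=index)

# levshape holds the number of distinct values in each index level;
# element 0 is the date level here.
date_count = df.index.levshape[0]

# Selecting on a level-0 value drops that level, leaving a frame
# indexed by the remaining level (the contract symbol).
chains = {date: list(df.loc[date].index) for date in df.index.levels[0]}
```

Note that `len(df)` on such a frame counts rows (contracts across all dates), not dates, which is why the test switches from `shape[0]` to `index.levshape[0]`.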
2 changes: 1 addition & 1 deletion Algorithm.Python/QuantConnect.Algorithm.Python.csproj
@@ -39,7 +39,7 @@
     <Compile Include="..\Common\Properties\SharedAssemblyInfo.cs" Link="Properties\SharedAssemblyInfo.cs" />
   </ItemGroup>
   <ItemGroup>
-    <PackageReference Include="QuantConnect.pythonnet" Version="2.0.39" />
+    <PackageReference Include="QuantConnect.pythonnet" Version="2.0.40" />
   </ItemGroup>
   <ItemGroup>
     <Content Include="OptionUniverseFilterGreeksShortcutsRegressionAlgorithm.py" />