This code implements a tool to extract causal chains from text by summarizing the text using the bart-cause-effect
model from Hugging Face Transformers and then linking the causes and effects with cosine similarity calculated using the Sentence Transformer
model.
Library can be installed with pip install causal-chains
- Initialize a
CausalChain
object with a list of chunks of text as input. - Run the
create_effects
method to get the cause and effect pairs from the text and then link the events based on cosine similarity of their embeddings. - Run the
visualize
method to see the largest chain.
The class "CausalChain" has the following methods:
-
create_effects
: This method uses the pipeline from the transformers library to analyze the text for cause-effect relationships. The text is first split into chunks, with each chunk containing a maximum of 3 sentences. This is done to limit the output length and avoid memory issues. The pipeline then generates summaries of the cause-effect relationships in the text. The triggers and effects of these relationships are stored in separate lists. -
create_connections
: This method uses the SentenceTransformer from the sentence-transformers library to encode the triggers and effects generated in the create_effects method. The method then calculates the cosine similarity between the triggers and effects to determine if there is a cause-effect relationship between them. If the cosine similarity score is above a certain threshold (default is 0.6), a connection is established between the trigger and effect. The connections between triggers and effects are stored in a dictionary. -
find_biggest_chain
: This method finds the longest chain of cause-effect relationships in the text. It starts from an effect and follows the connections stored in the connections dictionary to find other effects that are connected to it. The method continues this process until there are no more connections or the chain reaches a certain length (default is 10). The longest chain found is stored in the biggest_chain attribute. -
visualize
: This method displays a given chain using pydot.
from causal_chains.CausalChain import CausalChain, util
import wikipedia
text = wikipedia.page("ChristopherColumbus").content
chunks = util.create_chunks(text)
cc = CausalChain(chunks,device=0)
cc.create_connections()
biggest_chain = cc.biggest_chain
cc.visualize(biggest_chain)
The display that this code produces is shown at the top of this page.