What Is RNA Profiling
RNA Profiling identifies and presents the major substructural trends in a Boltzmann sample. It also relates these trends to one another in a summary profile graph, which allows significant commonalities and differences to emerge.
Version 2 of RNA Profiling adds a new interactive output, a cleaner form of summary profile graph, and new descriptions of the substructural trends.
Profiling works by identifying the most common features found in the sample, both individually and in combination. In particular, equivalence classes are used to group together structures with trivial differences, and thresholds are employed to eliminate low-frequency elements. Together, these two steps remove noise and highlight the major, high-frequency structural trends.
More specifically:
- Profiling starts with a Boltzmann sample of structures. We recommend 1,000 structures for exploratory purposes and 10,000 for greater reproducibility. Longer sequences require larger samples.
- Next, it assigns every helix present in the sample to an equivalence class known as a helix class. All helix classes are then sorted in descending order of frequency. A frequency threshold is determined, either by the program by default or by the user, and all helix classes above it are chosen as selected helix classes.
- These selected helix classes may then be grouped together into stem classes, whose frequency can be counted in either an exact or a fuzzy manner. Depending on user settings, either selected helix classes or stem classes serve as the features for the remainder of the analysis.
- Each structure is 'profiled' according to the features it contains. Thus, every profile is a subset of the features, and structures with the same profile form an equivalence class denoted by that common profile. All profiles are sorted in descending order of frequency. A frequency threshold is determined, either by the program by default or by the user, and all profiles above it are chosen as selected profiles.
- RNAprofiling v1 displayed profiles in a Hasse diagram induced by the partial ordering of set inclusion. Although v2 still supports this output, it also adds a decision-tree-based output, which highlights the differences between profiles.
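
The steps above can be sketched in Python on a toy sample. This is only an illustration of the idea, not the RNAprofiling implementation: the helix-class labels, the counts, the threshold values, and the helper names `select_above` and `hasse_edges` are all made up here, and a strict "above the threshold" comparison is assumed. Grouping raw helices into helix classes, stem classes, and the program's default threshold selection are not modeled.

```python
from collections import Counter

# Toy sample: each structure is represented by the set of helix-class labels
# it contains. Labels and multiplicities are invented for illustration.
sample = (
    [frozenset({"A", "B"})] * 5
    + [frozenset({"A", "B", "C"})] * 3
    + [frozenset({"A"})] * 1
    + [frozenset({"A", "D"})] * 1
)


def select_above(counts, threshold):
    """Return keys whose count is strictly above threshold, most frequent first."""
    return [key for key, n in counts.most_common() if n > threshold]


# Helix-class frequency = number of structures containing that helix class.
helix_counts = Counter(h for structure in sample for h in structure)
features = set(select_above(helix_counts, threshold=2))

# A structure's profile is the subset of selected features it contains.
# Structures that differ only in low-frequency helix classes share a profile.
profile_counts = Counter(frozenset(s & features) for s in sample)
selected_profiles = select_above(profile_counts, threshold=1)


def hasse_edges(profiles):
    """Edges (lower, upper) of the Hasse diagram of profiles under set inclusion."""
    edges = []
    for lo in profiles:
        for hi in profiles:
            # Keep lo -> hi only if no other profile lies strictly between them.
            if lo < hi and not any(lo < mid < hi for mid in profiles):
                edges.append((lo, hi))
    return edges


for lo, hi in hasse_edges(selected_profiles):
    print(sorted(lo), "->", sorted(hi))
```

In this toy sample the rare helix class `D` falls below the feature threshold, so the single structure containing `A` and `D` is counted under the same profile as the structure containing only `A`. The Hasse edges link each selected profile only to its immediate supersets, so `{A}` connects to `{A, B}` but not directly to `{A, B, C}`.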
Example output is available here