# manifest.yml
toolforge: 1.0
container: xrzipo81
type: tool
environment:
  size: medium
parameters:
  - type: string
    domain:
      type: pattern
      pattern: .{1,80}
    name: IdColumnName
    description: |
      The name of the column in the `SocialPosts` input to use as the post ID
      in the Entities output. This column should have a unique value for each
      social post. Typical choices are ID and URL values.
    required: true
  - type: string
    domain:
      type: pattern
      pattern: .{1,80}
    name: TextColumnName
    description: |
      The name of the column in the `SocialPosts` input to use as the post text
      to parse. This column should contain the post "body", which is the text
      that contains the entities to be parsed.
    required: true
inputs:
  - name: SocialPosts
    description: |
      The spreadsheet containing the social media posts to analyze. The
      spreadsheet must contain at least two columns, whose names are given by
      the `IdColumnName` and `TextColumnName` parameters above.
    extensions:
      - txt
      - csv
      - xls
      - xlsx
outputs:
  - name: Entities
    description: |
      The entity data parsed from the social media posts in the `SocialPosts`
      input. If there are more than one million entities, then only the first
      one million are shown. To expedite analysis, entities are lowercased,
      except for entities where case is significant. The output will contain
      the following columns:
      * `id` -- The post's value from the `IdColumnName` column.
      * `type` -- The type of entity, e.g., `hashtag`, `link`, etc.
      * `value` -- The value of the entity, e.g., `#helloworld`, `https://cnn.com/`, etc.
    extensions:
      - csv
      - xlsx
  - name: Counts
    description: |
      A frequency analysis of the entity data parsed from the social media
      posts in the `SocialPosts` input. If there are more than one million
      unique entities, then only the first one million are shown. To expedite
      analysis, entities are lowercased, except for entities where case is
      significant. The output will contain the following columns:
      * `type` -- The type of entity, e.g., `hashtag`, `link`, etc.
      * `value` -- The value of the entity, e.g., `#helloworld`, `https://cnn.com/`, etc.
      * `count` -- The number of times the entity was found in the `SocialPosts` input.
    extensions:
      - csv
      - xlsx
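# Illustrative example only. The column names and the post below are
# hypothetical and are not taken from the tool itself. Assuming IdColumnName
# is set to `post_id`, TextColumnName is set to `body`, and hashtags are among
# the entities that get lowercased, a SocialPosts row such as
#
#   post_id,body
#   1,Hello #HelloWorld https://cnn.com/
#
# might produce these Entities rows:
#
#   id,type,value
#   1,hashtag,#helloworld
#   1,link,https://cnn.com/
#
# and these Counts rows:
#
#   type,value,count
#   hashtag,#helloworld,1
#   link,https://cnn.com/,1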