pr for new vision of md and chunk #101

Closed
wants to merge 64 commits into from
64 commits
bc2727f
fix bug on to page
Catrunaround Jul 19, 2024
343234d
delet api test func
Catrunaround Jul 19, 2024
f2aef2a
fix missing pkl file bug and use md_file in to_page
Catrunaround Jul 20, 2024
1987188
use base to_page instead of subclass
Catrunaround Jul 22, 2024
b5cad4f
delet api using method
Catrunaround Jul 22, 2024
f16197c
write rst test didn't finish video tests
Catrunaround Jul 22, 2024
f01b488
mp4 test didn't finsih
Catrunaround Jul 25, 2024
0dde356
commited video debuger
Catrunaround Jul 25, 2024
102c53d
add url to page
Aug 16, 2024
a4ae6b7
merge
Aug 16, 2024
050648a
testing added url
Aug 23, 2024
22a1259
Merge branch 'main' of https://github.com/augcog/tai
Aug 28, 2024
182e689
update to page and to chunk method
Sep 3, 2024
252a531
Merge branch 'main' of https://github.com/Catrunaround/tai into yikan…
Sep 3, 2024
3816b9c
add page num in chunk
Sep 6, 2024
d953dcf
Remove large file
Sep 6, 2024
e88b894
fix local problem
Sep 6, 2024
31db238
fix page num bug now it works
Sep 9, 2024
93978b1
page num without extract file
Sep 13, 2024
aaee512
add page num to yaml file
Sep 13, 2024
c5b35ff
debug
Sep 19, 2024
b6c2729
add new yaml file to store page number and use it to track page info
Sep 23, 2024
a0a6a79
add chunk.py
Sep 23, 2024
8dc309b
fix line error
Sep 24, 2024
20b24cb
command change
Sep 25, 2024
19cd5e5
Merging main into yikang_testing2 to incorporate latest changes from …
Sep 25, 2024
b69b881
merge new pr
Sep 30, 2024
90962c2
updata torch nougat
Sep 30, 2024
e016d05
delete no image pdf and page num yaml file after convert
Oct 7, 2024
63ca9fb
fix url not available
Oct 8, 2024
1983b5e
commit change
Oct 8, 2024
e194ca0
Merge pull request #9 from Catrunaround/yikang_testing2
Catrunaround Oct 8, 2024
8a857d8
use torch for result
FranardoHuang Oct 8, 2024
13951bf
commit change
FranardoHuang Oct 8, 2024
419963b
rebase the commit
Oct 14, 2024
ccf3aaf
merge
Oct 14, 2024
61022da
resolve merge
Oct 14, 2024
6745525
add
Oct 14, 2024
4508b01
merge
Oct 14, 2024
7d57f40
use the latest conversion
Oct 15, 2024
0c5855d
added scrape pdf
Oct 15, 2024
956d30b
delete rst test file for now
Oct 15, 2024
c5edcb8
reslove merge
Nov 7, 2024
46603e6
find empty chunk
Nov 7, 2024
50a173b
merge header to subheader if applicable
Nov 8, 2024
ab7d69a
update debug-page
Nov 13, 2024
03e4330
updata yaml file for test data
FranardoHuang Nov 13, 2024
9a07a69
a
Nov 14, 2024
a095a34
Merge pull request #13 from Catrunaround/test-branch
Catrunaround Nov 14, 2024
c47cb96
added empty header dected
Nov 18, 2024
460c9da
delete page info file
Nov 19, 2024
a67cb5c
megre main
Nov 19, 2024
c0858e3
commit change
Nov 25, 2024
e654d74
merge new
Nov 25, 2024
5fa1b5c
fix page num = none
Nov 26, 2024
5a492e4
remove page info in content
Dec 2, 2024
ec2fbc7
comment optimized content for testing md and pkl file
Dec 2, 2024
c7613dc
comment change
Dec 2, 2024
bec1282
Merge branch 'augcog:main' into main
Catrunaround Dec 2, 2024
2e634e0
merge new version
Dec 2, 2024
7ce25f7
merge new version
Dec 2, 2024
50a468c
add new structure to pkl structure check
Dec 4, 2024
3fdf112
delete testing file
Dec 4, 2024
f778f2d
Merge branch 'augcog:main' into main
Catrunaround Dec 8, 2024