
jyknight/llvm-git-migration


Some current outstanding issues:

1. DONE.

   Need to get a process in place to keep the author map data
   updated. Ideally, the maintainers of the official file would check
   in changes somewhere? In the meantime, new authors make the script
   stop until it's fixed.

2. DONE. Deployed on GCE VM.

   Need to deploy this process to a VM somewhere, and make sure it
   keeps running.

3. DEFERRED -- can be done at any point in the future.

   Can/should we get rid of some of the branches (not the release_*
   branches, but the miscellaneous other ones)? There's a whole ton of
   them which are entirely useless. Then there's all the "svntag/*"
   branches which came from svn "tag"s, but which weren't actually
   tags.

4. Decide if the flat repository layout chosen here (top-level
   projects each exist as a top-level directory) is the final
   repository layout. If not, adjust layout to match final decision.

5. DONE ages ago, but I failed to notice.
   (e.g. cmake -DLLVM_ENABLE_PROJECTS=clang)

   Create better build system support for the chosen repository
   layout. E.g. with this layout, you shouldn't need to manually
   symlink things into random places under another project in the
   source tree to make the build tools build everything.  At the same
   time, it (probably) shouldn't be the case that everything builds
   all the time. Maybe a top-level cmake file that has options like
   -DWITH_CLANG -DWITH_LLD? Something else?

...And, finally...are we actually going to do monorepo? Who's going to
make that decision?


Re #3: starting a list of useless branches:
llvm
poolalloc
PowerPC_0
svntag/comeback
svntag/intel_release
svntag/JTC_POOL_WORKS
svntag/main_lastmerge
svntag/May2007
svntag/OldStatistics
svntag/PA_111
svntag/PA_112
svntag/pldi2005
svntag/PowerPC_0_0
svntag/rel19_lastmerge
svntag/start



===== Previous notes =====
1. I use trac's sqlite database of svn metadata to ease lookup of events in svn history.

So, you can run queries like:
  select * from node_change where path regexp '^llvm/branches/[^/]*$' order by path,rev;
to find all such changes, across all versions.
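
The same kind of query can be run from Python. This is a minimal
sketch, not the script used here: it assumes the table is called
node_change with columns (repos, rev, path, node_type, change_type,
base_path, base_rev), as the rows later in these notes suggest.
SQLite has no built-in REGEXP implementation, so the connection has
to register one; open_with_regexp and branch_changes are hypothetical
helper names.

```python
import re
import sqlite3


def _regexp(pattern, value):
    # SQLite evaluates "X regexp Y" as regexp(Y, X): the pattern comes first.
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


def open_with_regexp(db_path):
    """Open a SQLite database and register a REGEXP function on the connection."""
    conn = sqlite3.connect(db_path)
    conn.create_function("regexp", 2, _regexp)
    return conn


def branch_changes(conn, project="llvm"):
    """Return (rev, path, change_type) rows for direct children of <project>/branches."""
    pattern = "^" + re.escape(project) + "/branches/[^/]*$"
    return conn.execute(
        "select rev, path, change_type from node_change "
        "where path regexp ? order by path, rev",
        (pattern,),
    ).fetchall()
```

Passing the pattern as a bound parameter avoids quoting problems for
project names such as llvm-gcc-4.2.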

update-repo.sh updates my local svn repository, and the trac db.

2. There's a script, extract-author-ids.py, to generate an author map
from the local svn repo (which uses usernames) and the git repos
(which use email addresses).

This attempts to handle people having changed email addresses in the
past. I'm not sure if we actually want to use historical emails, or
just give everyone's historical contributions their current
email.

TBD, but the info is there.


3. After attempting to use a few tools (git-svn, reposurgeon), and
having them blow up in varied ways, it appears that the tool that
will actually work is the KDE project's "svn2git" aka
"svn-all-fast-export" project. It can be found at
<https://github.com/svn-all-fast-export/svn2git.git>.

...it needed a couple of patches, so use the version at:
<https://github.com/jyknight/svn2git.git>

Working on a rules file for it in llvm-svn2git.rules, and a driver
script in 'llvm-svn2git.sh'

...Welp, then it turned out I also needed to rewrite 'git filter-branch',
because it's horribly, horribly slow. The generic infrastructure for
that lives in fast_filter_branch.py, the llvm-specific rules in
llvm_filter.py.

I also used that module for fixup-tags.py.

More things to be done:

1. Editing commit messages (do we want to replace r12345 -> git
   hash?). Decided this is probably not worthwhile. Mapping from svn
   rev to githash is easy enough, e.g.:
       git show ':/llvm-svn: 3009\b'
       git rev-parse ':/llvm-svn: 3009\b'

2. Extracting authorship information from commit message "Patch by
   XXXX!" lines. (Decided that we're NOT going to do this, since
   that's not actually reliable enough anyways)

3. Add support for svn:mergeinfo to svn-all-fast-export (?)

4. Fix CVS disasters from before cvs2svn conversion
   First SVN revision: r37801, Fri Jun 29 16:35:07 2007 +0000
   RELEASE_21 was first release.
   Up through: llvm-commits/Week-of-Mon-20070625.txt.gz
   r3583 was first in mail-archives, Fri Sep 6 02:02:58 2002 +0000
   Done.
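
For item 1 above, a whole svn-rev-to-git-hash table can be built in
one pass instead of running one ':/' search per revision. The
following is a rough sketch under assumptions: it takes it that each
converted commit message contains the text "llvm-svn: N", as the
'git show' pattern above implies. svn_rev_map and read_commits are
hypothetical names, not scripts from this repository.

```python
import re
import subprocess

# Same text the ':/llvm-svn: 3009\b' search relies on.
SVN_REV = re.compile(r"\bllvm-svn: (\d+)\b")


def svn_rev_map(commits):
    """Map svn revision number -> git hash, given (hash, message) pairs."""
    mapping = {}
    for git_hash, message in commits:
        match = SVN_REV.search(message)
        if match:
            # If a revision shows up more than once, keep the first hash seen.
            mapping.setdefault(int(match.group(1)), git_hash)
    return mapping


def read_commits(repo="."):
    """Yield (hash, message) for every commit reachable from any ref."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--all", "--format=%H%x00%B%x01"],
        check=True, capture_output=True, text=True, errors="replace",
    ).stdout
    for record in out.split("\x01"):
        record = record.strip("\n")
        if record:
            git_hash, _, message = record.partition("\x00")
            yield git_hash, message
```

Usage would be along the lines of svn_rev_map(read_commits("."))[3009],
run from inside the converted repository.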


Branchpoints:
LLVM 2.1:
   Date:   Thu Sep 13 01:26:11 2007 +0000
   llvm-svn=41908

LLVM 2.0: (release_20~1)
   Date:   Tue May 8 02:19:56 2007 +0000
   llvm-svn=36919

LLVM 1.9: (release_19~9)
   Date:   Tue Nov 7 04:12:03 2006 +0000
   llvm-svn=31488

LLVM 1.8: (release_18~1)
   Date:   Thu Jul 27 04:24:14 2006 +0000
   llvm-svn=29324

LLVM 1.7: (release_17)
   Date:   Thu Apr 13 19:50:07 2006 +0000
   llvm-svn=27674

LLVM 1.6: (release_16)
    Date:   Thu Oct 27 16:30:44 2005 +0000
    llvm-svn=24041

LLVM 1.5: (release_15)
    Date:   Fri May 13 17:42:54 2005 +0000
    llvm-svn=21942

LLVM 1.4: (release_14)

git checkout origin/release_20~1
diff -r -b ../llvm-2.0/ llvm/

git checkout origin/release_19~9
diff -r -b ../llvm-1.9/ llvm/

git checkout origin/release_18~1
diff -r -b ../llvm-1.8a/ llvm/

git checkout origin/release_17
diff -r -b ../llvm-1.7/ llvm/

git checkout origin/release_16
diff -r -b ../llvm-1.6/ llvm/

git checkout origin/release_15
diff -r -b ../llvm-1.5/ llvm/

git checkout origin/release_14
diff -r -b ../llvm-1.4/ llvm/

Only in ../llvm-1.4/docs: LLVMVsTheWorld.html
Only in ../llvm-1.4/runtime/GCCLibraries: libg

git checkout origin/release_13
diff -r -b ../llvm-1.3/ llvm/

Only in llvm/docs/CommandGuide: manpage.css
Only in ../llvm-1.3/runtime/GCCLibraries: libg
Binary files ../llvm-1.3/runtime/libpng/contrib/visupng/VisualPng.ico and llvm/runtime/libpng/contrib/visupng/VisualPng.ico differ
Binary files ../llvm-1.3/runtime/libpng/projects/beos/x86-shared.proj and llvm/runtime/libpng/projects/beos/x86-shared.proj differ
Binary files ../llvm-1.3/runtime/libpng/projects/beos/x86-static.proj and llvm/runtime/libpng/projects/beos/x86-static.proj differ
Binary files ../llvm-1.3/runtime/zlib/contrib/vstudio/vc7/gvmat32.obj and llvm/runtime/zlib/contrib/vstudio/vc7/gvmat32.obj differ
Binary files ../llvm-1.3/runtime/zlib/contrib/vstudio/vc7/inffas32.obj and llvm/runtime/zlib/contrib/vstudio/vc7/inffas32.obj differ
Only in ../llvm-1.3/test: QMTest

git checkout origin/release_12
diff -r -b ../llvm-1.2/ llvm/

Only in ../llvm-1.2/runtime/GCCLibraries: libg
Binary files ../llvm-1.2/runtime/libpng/contrib/visupng/VisualPng.ico and llvm/runtime/libpng/contrib/visupng/VisualPng.ico differ
Binary files ../llvm-1.2/runtime/libpng/projects/beos/x86-shared.proj and llvm/runtime/libpng/projects/beos/x86-shared.proj differ
Binary files ../llvm-1.2/runtime/libpng/projects/beos/x86-static.proj and llvm/runtime/libpng/projects/beos/x86-static.proj differ
Binary files ../llvm-1.2/runtime/zlib/contrib/vstudio/vc7/gvmat32.obj and llvm/runtime/zlib/contrib/vstudio/vc7/gvmat32.obj differ
Binary files ../llvm-1.2/runtime/zlib/contrib/vstudio/vc7/inffas32.obj and llvm/runtime/zlib/contrib/vstudio/vc7/inffas32.obj differ
Only in ../llvm-1.2/test: Programs
Only in ../llvm-1.2/test: QMTest


git checkout origin/release_11
diff -r -b ../llvm-1.1/ llvm/

Only in ../llvm-1.1/docs: LLVMVsTheWorld.html
Only in ../llvm-1.1/runtime/GCCLibraries: libg
Only in ../llvm-1.1/test: Programs
Only in ../llvm-1.1/test: QMTest

git checkout origin/release_10
diff -rbq ../llvm-1.0/ llvm/

Files llvm/docs/ReleaseNotes.html and ../llvm-1.0/docs/ReleaseNotes.html differ
Only in ../llvm-1.0/runtime/GCCLibraries: libg
Only in ../llvm-1.0/test: Programs
Only in ../llvm-1.0/test: QMTest
Only in llvm/test/Regression/Jello: 2003-01-04-PhiTest.ll
Only in ../llvm-1.0/test/Regression/Jello: 2003-01-04-PhiTest.llx
Only in llvm/test/Regression/Jello: 2003-10-18-PHINode-ConstantExpr-CondCode-Failure.ll
Only in ../llvm-1.0/test/Regression/Jello: 2003-10-18-PHINode-ConstantExpr-CondCode-Failure.llx
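
The tarball-versus-checkout comparisons above could also be
scripted. This is only a sketch using Python's filecmp module;
compare_trees is a hypothetical helper. Unlike 'diff -r -b' it does
not ignore whitespace changes, so it may report more differing files
than the listings above.

```python
import filecmp
import os


def compare_trees(left, right):
    """Return (only_left, only_right, differing) as sorted lists of relative paths."""
    only_left, only_right, differing = [], [], []

    def walk(cmp, rel):
        only_left.extend(os.path.join(rel, name) for name in cmp.left_only)
        only_right.extend(os.path.join(rel, name) for name in cmp.right_only)
        for name in cmp.common_files:
            # shallow=False forces a byte-for-byte content comparison.
            same = filecmp.cmp(
                os.path.join(cmp.left, name),
                os.path.join(cmp.right, name),
                shallow=False,
            )
            if not same:
                differing.append(os.path.join(rel, name))
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(rel, name))

    # ignore=[] turns off dircmp's default skip list (CVS, tags, .git, ...).
    walk(filecmp.dircmp(left, right, ignore=[]), "")
    return sorted(only_left), sorted(only_right), sorted(differing)
```

For example, compare_trees("../llvm-1.4", "llvm") after checking out
origin/release_14.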



============
NOTES about historical events discovered trawling through the trac db:
============

Note: When a node_change row has change_type="M" (move), it is an Add
on "path" and, at the same time, a Delete on "base_path". (That's the
only action type that modifies base_path rather than path.)
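
The move rule above can be written down as a small helper. This is a
sketch that assumes the pipe-separated sqlite output format shown in
these notes, with columns repos, rev, path, node_type, change_type,
base_path, base_rev; NodeChange, parse_row and path_effects are
hypothetical names.

```python
from typing import List, NamedTuple, Optional, Tuple


class NodeChange(NamedTuple):
    repos: str
    rev: int
    path: str
    node_type: str
    change_type: str
    base_path: Optional[str]
    base_rev: int


def parse_row(line: str) -> NodeChange:
    """Parse one row such as '1|0000039733|cfe.dead|D|M|cfe|39731'."""
    repos, rev, path, node_type, change_type, base_path, base_rev = (
        line.strip().split("|")
    )
    return NodeChange(
        repos, int(rev), path, node_type, change_type,
        base_path or None, int(base_rev),
    )


def path_effects(change: NodeChange) -> List[Tuple[str, Optional[str]]]:
    """Return the (action, path) pairs implied by one node_change row."""
    if change.change_type == "M":
        # A move adds the new path and deletes the path it came from.
        return [("A", change.path), ("D", change.base_path)]
    return [(change.change_type, change.path)]
```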

Weird events in history which need to be corrected in git migration:

cfe.dead --
   cfe/ was initially under an extra directory: cfe/cfe/trunk.
   cfe/ was copied to cfe.dead/ @39733.
   cfe/ was deleted @39734.
   cfe.dead/cfe was copied to cfe/ @39734
   cfe.dead/ was deleted @39736
  ---> Should preserve history of cfe/ through this transition.

Corresponding node_change rows from the trac db:
1|0000038531|cfe|D|A||-1
1|0000039733|cfe.dead|D|M|cfe|39731
1|0000039734|cfe|D|M|cfe.dead/cfe|39733
1|0000039736|cfe.dead|D|D|cfe.dead|39735

dragonegg --
  Renamed from gcc-plugin at r83830, should keep together.

1|0000083830|dragonegg|D|M|gcc-plugin|83829


Other deleted toplevel dirs:
elp -- removed in r154133, no content
llvm-gcc-4-2 -- renamed to llvm-gcc-4.2 at r39966
lold -- was old name for lld, renamed at r146828. No useful history prior to that.
tmp -- trash.
core # Abandoned modularization of llvm
sample # Abandoned modularization of llvm
support # Abandoned modularization of llvm
llvm-top # Abandoned modularization of llvm
website # Not actually the website (that's in www)

Proposed to migrate into mono-repo:
cfe # Rename to clang
clang-tools-extra # Basically a part of clang...
compiler-rt
debuginfo-tests # Also basically part of clang...
dragonegg
libclc
libcxx
libcxxabi
libunwind
lld
lldb
llgo
llvm
openmp
polly
parallel-libs
poolalloc

Migrate into mono-repo, but delete retroactively:
hlvm # delete after r41689 (but filter out later r160869 and r160870 "test commits")
java # delete after r26059
stacker # delete after r40406
vmkit # delete after r219392


Migrate to separate repositories:
lnt
test-suite
www # The website
www-pubs # More of the website
www-releases # More of the website
zorg # The buildbot infrastructure

Don't make a git conversion at all:
clang-tests # Copy of GCC 4.2 testsuite, modified to work with clang
clang-tests-external # Copy of GDB testsuite
giri # Never actually developed here; actually https://github.com/liuml07/giri
klee # Already migrated to github with history; https://github.com/klee/klee
llvm-gcc-4.0 # GCC 4.0, modified for llvm
llvm-gcc-4.2 # GCC 4.2, modified for llvm
llbrowse # An LLVM bitcode GUI browser
television # A different LLVM GUI browser; shows effects of transforms, etc.
nightly-test-server
safecode # Contains a whole copy of clang??




Some things under "branches" and "tags" are directories with other stuff in them, not actually a copy of trunk.

sqlite> select * from node_change where path regexp "/branches/[^/]*$" and change_type="A";
1|0000041911|test-suite/branches/release_21|D|A||-1
1|0000041913|llvm/branches/release_21|D|A||-1
1|0000046175|llvm/branches/Apple|D|A||-1
1|0000048561|llvm/branches/ggreif|D|A||-1
1|0000050231|llvm-gcc-4.2/branches/Apple|D|A||-1
1|0000062797|cfe/branches/Apple|D|A||-1
1|0000087875|safecode/branches/unlabeled-1.1.2|D|A||-1
1|0000101452|llvm-gcc-4.2/branches/ggreif|D|A||-1
1|0000101456|cfe/branches/ggreif|D|A||-1
1|0000101458|test-suite/branches/ggreif|D|A||-1
1|0000104458|llvm/branches/wendling|D|A||-1
1|0000124731|lldb/branches/apple|D|A||-1
1|0000134073|llvm/branches/GSoC|D|A||-1
1|0000168115|llvm/branches/google|D|A||-1
1|0000168152|cfe/branches/google|D|A||-1
1|0000168153|clang-tools-extra/branches/google|D|A||-1
1|0000168154|compiler-rt/branches/google|D|A||-1
1|0000177017|lldb/branches/google|D|A||-1
1|0000177017|polly/branches/google|D|A||-1
1|0000195627|vmkit/branches/mcjit|D|A||-1

sqlite> select * from node_change where path regexp "/tags/[^/]*$" and change_type="A" and path not regexp ".*/tags/RELEASE_[0-9]*";
1|0000039937|llvm/tags/Apple|D|A||-1
1|0000043976|llvm-gcc-4.0/tags/Apple|D|A||-1
1|0000044085|llvm-gcc-4.2/tags/Apple|D|A||-1
1|0000049202|llvm/tags/checker|D|A||-1
1|0000049204|cfe/tags/checker|D|A||-1
1|0000062626|cfe/tags/Apple|D|A||-1
1|0000080611|llvm/tags/cremebrulee|D|A||-1
1|0000080612|cfe/tags/cremebrulee|D|A||-1
1|0000083187|compiler-rt/tags/Apple|D|A||-1
1|0000168129|llvm/tags/google|D|A||-1
1|0000168158|cfe/tags/google|D|A||-1
1|0000168159|clang-tools-extra/tags/google|D|A||-1
1|0000168160|compiler-rt/tags/google|D|A||-1
1|0000177017|lldb/tags/google|D|A||-1
1|0000177017|polly/tags/google|D|A||-1

sqlite> select * from node_change where path regexp "/(tags|branches)/[^/]*$" and (base_path is null or base_path not regexp ".*/(trunk|(branches|tags)/[^/]+)") and path not like "%/tags/RELEASE_%";
1|0000039937|llvm/tags/Apple|D|A||-1
1|0000041911|test-suite/branches/release_21|D|A||-1
1|0000041913|llvm/branches/release_21|D|A||-1
1|0000043976|llvm-gcc-4.0/tags/Apple|D|A||-1
1|0000044085|llvm-gcc-4.2/tags/Apple|D|A||-1
1|0000046175|llvm/branches/Apple|D|A||-1
1|0000048561|llvm/branches/ggreif|D|A||-1
1|0000049202|llvm/tags/checker|D|A||-1
1|0000049204|cfe/tags/checker|D|A||-1
1|0000050231|llvm-gcc-4.2/branches/Apple|D|A||-1
1|0000062626|cfe/tags/Apple|D|A||-1
1|0000062797|cfe/branches/Apple|D|A||-1
1|0000080611|llvm/tags/cremebrulee|D|A||-1
1|0000080612|cfe/tags/cremebrulee|D|A||-1
1|0000083187|compiler-rt/tags/Apple|D|A||-1
1|0000087875|safecode/branches/unlabeled-1.1.2|D|A||-1
1|0000101452|llvm-gcc-4.2/branches/ggreif|D|A||-1
1|0000101456|cfe/branches/ggreif|D|A||-1
1|0000101458|test-suite/branches/ggreif|D|A||-1
1|0000104458|llvm/branches/wendling|D|A||-1
1|0000124731|lldb/branches/apple|D|A||-1
1|0000134073|llvm/branches/GSoC|D|A||-1
1|0000168115|llvm/branches/google|D|A||-1
1|0000168129|llvm/tags/google|D|A||-1
1|0000168152|cfe/branches/google|D|A||-1
1|0000168153|clang-tools-extra/branches/google|D|A||-1
1|0000168154|compiler-rt/branches/google|D|A||-1
1|0000168158|cfe/tags/google|D|A||-1
1|0000168159|clang-tools-extra/tags/google|D|A||-1
1|0000168160|compiler-rt/tags/google|D|A||-1
1|0000177017|lldb/branches/google|D|A||-1
1|0000177017|lldb/tags/google|D|A||-1
1|0000177017|polly/branches/google|D|A||-1
1|0000177017|polly/tags/google|D|A||-1
1|0000195627|vmkit/branches/mcjit|D|A||-1
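
The filter in the last query can be restated as a plain predicate,
which may be easier to reuse from a conversion script. This is a
sketch, assuming the REGEXP function in use has "search" semantics
(match anywhere in the string); is_container_dir is a hypothetical
name.

```python
import re

# path regexp "/(tags|branches)/[^/]*$"
UNDER_BRANCHES_OR_TAGS = re.compile(r"/(tags|branches)/[^/]*$")
# base_path regexp ".*/(trunk|(branches|tags)/[^/]+)"
COPIED_FROM_LINE_OF_DEV = re.compile(r".*/(trunk|(branches|tags)/[^/]+)")
# path like "%/tags/RELEASE_%": LIKE is case-insensitive for ASCII by
# default, and "_" matches any single character.
RELEASE_TAG = re.compile(r"/tags/RELEASE.", re.IGNORECASE | re.DOTALL)


def is_container_dir(path, base_path):
    """True if path sits directly under branches/ or tags/ but was not
    copied from trunk or from another branch or tag."""
    if not UNDER_BRANCHES_OR_TAGS.search(path):
        return False
    if RELEASE_TAG.search(path):
        return False
    return not base_path or not COPIED_FROM_LINE_OF_DEV.search(base_path)
```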



Looking at this to check for sanity of conversion:
git log -m --no-renames --pretty=format: --name-only --diff-filter=A --all | pyuniq > ../all-files
for x in $(git show-ref|cut -d' ' -f2); do git ls-tree -r --full-tree $x; done |cut -f2 | pyuniq > ../head-files

grep HistoricalNotes/2000-11-18-EarlyDesignIdeas.txt ../all-files
grep include/clang/Basic/SourceManager.h ../all-files
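
pyuniq is not defined anywhere in these notes. Assuming it is an
order-preserving line de-duplicator, the two file lists could be
processed as sketched below. unique_lines and only_in_history are
hypothetical names.

```python
def unique_lines(lines):
    """Return non-empty lines in first-seen order, with duplicates dropped."""
    seen = set()
    result = []
    for line in lines:
        line = line.rstrip("\n")
        # 'git log --pretty=format:' emits blank separator lines; skip them.
        if line and line not in seen:
            seen.add(line)
            result.append(line)
    return result


def only_in_history(all_files, head_files):
    """Paths that were added at some point but exist at no ref's tip."""
    heads = set(head_files)
    return [path for path in all_files if path not in heads]
```

For instance, only_in_history(unique_lines(open("../all-files")),
unique_lines(open("../head-files"))) would list paths that no longer
exist on any branch or tag.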
