The project consists of seven parts, as shown in the figure below (Entity is an abstract concept and is not tied to a concrete component).
-
Text Analysis: performs word segmentation and part-of-speech (POS) tagging to extract the 'poem images' (nouns) and the actions of the characters (verbs).
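The segmentation and tagging step can be sketched as follows. This is a minimal illustration, not the project's tokenizer: it assumes a tiny hand-written lexicon (`LEXICON`, hypothetical) and uses forward maximum matching as a stand-in for a real Chinese word segmenter and POS tagger.

```python
# Hypothetical toy lexicon: word -> coarse POS tag ("n" = noun, "v" = verb).
LEXICON = {
    "举杯": "v",
    "邀": "v",
    "明月": "n",
    "对": "v",
    "影": "n",
    "三人": "n",
}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def segment(text):
    """Forward maximum matching: at each position take the longest lexicon word.

    Characters that match no lexicon entry become single-character tokens
    tagged "x" (unknown).
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            word = text[i:i + size]
            if word in LEXICON or size == 1:
                tokens.append((word, LEXICON.get(word, "x")))
                i += size
                break
    return tokens


def extract_entities(text):
    """Split tagged tokens into poem images (nouns) and actions (verbs)."""
    tokens = segment(text)
    images = [word for word, pos in tokens if pos == "n"]
    actions = [word for word, pos in tokens if pos == "v"]
    return images, actions
```

Calling `extract_entities("举杯邀明月")` separates the image "明月" from the actions "举杯" and "邀"; those keywords would then be used to look up materials.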
-
Material Library: poem-image materials are divided into three categories: a background database (for example moon, pavilion, landscape), a character database (old, middle-aged, and young people), and an action database (walking, running, arm movements, etc.).
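One possible way to organize the three databases is a small lookup structure keyed by keyword. The sketch below is an assumption for illustration only: the class name `MaterialLibrary`, the keywords, and all file paths are hypothetical and do not describe the project's actual storage layout.

```python
from dataclasses import dataclass, field


@dataclass
class MaterialLibrary:
    """Three keyword -> file path tables, one per material category."""
    backgrounds: dict = field(default_factory=dict)
    characters: dict = field(default_factory=dict)
    actions: dict = field(default_factory=dict)

    def lookup(self, keyword):
        """Return (category, path) for a keyword, or None if nothing matches."""
        for category in ("backgrounds", "characters", "actions"):
            table = getattr(self, category)
            if keyword in table:
                return category, table[keyword]
        return None


# Hypothetical entries; the paths are placeholders.
library = MaterialLibrary(
    backgrounds={"moon": "backgrounds/moon.png", "pavilion": "backgrounds/pavilion.png"},
    characters={"old man": "characters/old_man.png"},
    actions={"walking": "actions/walking.mp4"},
)
```

A keyword with no matching material returns `None`, so the caller can decide whether to skip it or fall back to a default.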
-
Motion Transfer: transfers a specific action to a specific character (for example, 'raise hand to invite the moon', an action that appears in a poem). It uses the EDN model (Everybody Dance Now), which has four components: pose detection (the OpenPose project), pose normalization, a GAN that maps pose images to the target subject's appearance, and a second GAN that adds realistic face synthesis. The model is open source; the code is in the [EverybodyDanceNow_reproduce_pytorch] folder.
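Of the four components, pose normalization is the easiest to show in isolation. The sketch below is a simplified assumption, not the code in the repository: it applies one global scale (the ratio of body heights) and aligns the ankle line, whereas the normalization described for EDN is more elaborate. The function and parameter names are hypothetical.

```python
def normalize_pose(keypoints, src_height, src_ankle_y, tgt_height, tgt_ankle_y):
    """Rescale a source skeleton so it matches the target's size and floor line.

    keypoints: list of (x, y) pixel coordinates, with y pointing down.
    The skeleton is scaled about its mean x and about the source ankle line,
    then shifted so the ankles land on the target ankle line.
    """
    if src_height <= 0:
        raise ValueError("src_height must be positive")
    scale = tgt_height / src_height
    center_x = sum(x for x, _ in keypoints) / len(keypoints)
    normalized = []
    for x, y in keypoints:
        new_x = center_x + (x - center_x) * scale
        new_y = tgt_ankle_y + (y - src_ankle_y) * scale
        normalized.append((new_x, new_y))
    return normalized
```

For instance, a 200-pixel-tall source skeleton mapped to a 100-pixel-tall target is halved in size and its ankles are moved to the target's ankle height, so the generated character stands on the same ground line as the target subject.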
-
Segmentation: separates characters from the background so that characters with transparent backgrounds can be composited into the animation. The model is an open-source implementation of Mask R-CNN, which generates a bounding box and a segmentation mask for each object instance in the image. It is based on a Feature Pyramid Network (FPN) with a ResNet101 backbone.
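Once a binary mask for the character is available, producing the transparent cut-out amounts to using the mask as an alpha channel. The sketch below assumes the mask has already been obtained from the segmentation model and represents images as plain nested lists rather than through an image library; the function name `cut_out` is hypothetical.

```python
def cut_out(image, mask):
    """Turn an RGB image plus a binary mask into an RGBA image.

    image: rows of (r, g, b) tuples.
    mask:  rows of 0/1 values with the same shape; 1 marks the character.
    Pixels outside the mask keep their color but get alpha 0 (transparent).
    """
    if len(image) != len(mask):
        raise ValueError("image and mask must have the same number of rows")
    result = []
    for pixel_row, mask_row in zip(image, mask):
        if len(pixel_row) != len(mask_row):
            raise ValueError("image and mask rows must have the same length")
        result.append([
            (r, g, b, 255 if keep else 0)
            for (r, g, b), keep in zip(pixel_row, mask_row)
        ])
    return result
```

The resulting RGBA frames can be layered over a stylized background without the original background showing through.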
-
Style Transfer: transfers scenery pictures into the style of traditional Chinese ink painting. It uses the CycleGAN model, which introduces a cycle-consistency loss that constrains an image translated to the other domain and back to match the original. We crawled the source-domain and target-domain datasets ourselves. To deal with low output resolution, we propose a super-resolution method based on multiply blending (正片叠底). The CycleGAN model is open source; the code is in the [cycle] folder.
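The cycle-consistency idea can be illustrated with a toy example. This is a sketch under strong assumptions: images are flat lists of floats, the two "generators" are ordinary Python functions passed in by the caller, and the weighting factor that normally scales this term is omitted. It is not the training code in the [cycle] folder.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized flat images."""
    if len(a) != len(b):
        raise ValueError("images must have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(photo, painting, to_ink, to_photo):
    """Penalize round trips that do not return to the starting image.

    to_ink:   generator mapping photo domain -> ink-painting domain.
    to_photo: generator mapping ink-painting domain -> photo domain.
    """
    forward = l1_distance(to_photo(to_ink(photo)), photo)
    backward = l1_distance(to_ink(to_photo(painting)), painting)
    return forward + backward


# Toy generators (hypothetical): halving and doubling are exact inverses,
# so the round trip reproduces the input and the loss is zero.
def darken(img):
    return [v * 0.5 for v in img]


def brighten(img):
    return [v * 2 for v in img]
```

With `darken` and `brighten` the loss is zero; replacing one of them with a function that is not the inverse of the other makes the loss positive, which is the signal that pushes the two generators toward consistent mappings during training.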
-
Animation Synthesis: the synthesis process has three components: material placement, effect addition (zooming, shifting), and frames-to-video conversion. Placement currently follows some hard-coded rules; more NLP-based understanding is planned for the future.
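The zooming and shifting effects can be thought of as interpolating a material's position and scale between a start and an end keyframe. The sketch below assumes simple linear interpolation and hypothetical names (`plan_frames`, the `x`/`y`/`scale` keys); it only plans per-frame placements and leaves out the frames-to-video step, which would need a video encoder.

```python
def interpolate(start, end, t):
    """Linear interpolation between start and end for t in [0, 1]."""
    return start + (end - start) * t


def plan_frames(start, end, frame_count):
    """Return one placement dict per frame, moving from start to end.

    start, end: dicts with "x", "y" (pixels) and "scale" (1.0 = original size).
    """
    if frame_count < 2:
        raise ValueError("frame_count must be at least 2")
    frames = []
    for i in range(frame_count):
        t = i / (frame_count - 1)
        frames.append({
            key: interpolate(start[key], end[key], t)
            for key in ("x", "y", "scale")
        })
    return frames
```

A material that shifts right while zooming in would be described by a start keyframe such as `{"x": 0, "y": 100, "scale": 1.0}` and an end keyframe such as `{"x": 100, "y": 100, "scale": 2.0}`.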
-
Web Display: provides a project introduction and a list of videos to watch.
-
utils: the crawler used by the project and the super-resolution method based on multiply blending (正片叠底) are in the [utils] folder.
-
step1: Show three videos and have the audience guess which poem each one depicts.
-
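step2: Explain the technical pipeline. 1. text2entity: obtain the materials in the material library that correspond to the nouns and verbs. 2. Segmentation: segment out the character materials for step 3; other materials go directly to step 4. 3. Material animation: use OpenPose + GAN + face GAN to generate animated materials. 4. Chinese-style transfer: render the backgrounds (plus characters) in Chinese ink-painting style with CycleGAN. 5. Compose the materials above in a set order into an animation (position + scaling) and output it as a video. 6. Front-end display; the back end stores the videos in a database.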
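The routing in steps 2 to 5 can be sketched as a small planning function. This is an illustrative assumption rather than the project's unified script: the stage names and the `plan_processing` function are hypothetical, and the sketch assumes characters are also stylized, which the step list marks only as optional.

```python
# Stage names are hypothetical labels for steps 2-5 of the pipeline.
CHARACTER_STAGES = ["segmentation", "motion_transfer", "style_transfer", "composition"]
BACKGROUND_STAGES = ["style_transfer", "composition"]


def plan_processing(materials):
    """Map each material to the ordered list of stages it goes through.

    materials: list of (name, kind) pairs, where kind is "character" or
    "background". Characters are segmented and animated before stylization;
    all other materials go straight to style transfer.
    """
    plan = {}
    for name, kind in materials:
        if kind == "character":
            plan[name] = list(CHARACTER_STAGES)
        else:
            plan[name] = list(BACKGROUND_STAGES)
    return plan
```

Keeping the routing separate from the stage implementations makes it easy to change which materials are stylized without touching the models themselves.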
-
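step3: Significance of the project: 1. Education (stories + picture animation). 2. Entertainment (motion-sensing interaction + fun guessing games). 3. Promoting traditional culture. 4. NLP + CV.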
-
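step4: Outlook: 1. Enrich the material library. 2. Interaction (voice + motion). 3. Scene logic of the generated animation (the physical meaning of the materials).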
-
PPT Link:
-
poster
-
Front end: two pages: 1. a display page; 2. a list page. (陈)
-
nlp: word segmentation. (燕)
-
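Find materials matching the poems (nouns + verbs): backgrounds + characters. Backgrounds: water, mountains, bridges, houses, the moon, etc. Characters: elderly, middle-aged, and young men and women. (邓)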
-
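Back end: 1. Find materials by noun and run steps 2, 3, and 4 above on them, in one unified script; (合) 2. Generate the animation, add suitable shifting and scaling, and merge everything into the final animation. (陈)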
-
GitHub maintenance and branch operations (燕)
-
Poster, PPT, and video production (合)
-
Easter egg at the end: a Chinese-style Douyin clip. Record 凯哥 reading a poem. (合)