2.14 ER 拆分

2.14.1 普通ER拆分

当我们需要两个表要进行join的时候,由于数据被分配到不同的节点上，普通的nest loop 方式可能效率会很低。
如果我们能把需要join的表按照统一规则划分到相同的区上，就能大概率的解决这一问题。
举个简单的例子如下：

这样两张表具有外键关系，我们可以仿照聚簇的方式按照对应列来进行拆分，这样就可以不跨库实现join了。
如下图:

要实现这样的ER功能，需要如下配置。

<table name="sales" dataNode="dn1,dn2" rule="sharding">
<childTable name="sales_detail" joinKey="sales_detail_pos_num" parentKey="sales_pos_num"/>
</table>

2.14.2 智能的ER关系

当我们有多个不同的表时，上面的配置方式有点难以使用了。
这种情况下，如果2张或者多张表在mycat上的分片规则相同并且具体分片也相同，即使没有配置ER关系，也会当作ER关系来处理。
举例如下配置（片段）：

<!--schema片段-->
<table name="tableA" dataNode="dn1,dn2" rule="ruleA" />
<table name="tableB" dataNode="dn1,dn2" rule="ruleB" />
<table name="tableC" dataNode="dn2,dn1" rule="ruleC" />
<table name="tableD" dataNode="dn3,dn4" rule="ruleA" />
<table name="tableE" dataNode="dn1,dn2" rule="ruleA" />
<table name="tableF" dataNode="dn1,dn2" rule="ruleF" />
 
<!--rule片段-->
<tableRule name="ruleA">
   <rule>
      <columns>id_a</columns>
      <algorithm>hash_function</algorithm>
   </rule>
</tableRule>
<tableRule name="ruleB">
   <rule>
      <columns>id_b</columns>
      <algorithm>hash_function</algorithm>
   </rule>
</tableRule>
<tableRule name="ruleC">
   <rule>
      <columns>id_c</columns>
      <algorithm>hash_function</algorithm>
   </rule>
</tableRule>
<tableRule name="ruleF">
   <rule>
      <columns>id_a</columns>
      <algorithm>enum_par</algorithm>
   </rule>
</tableRule>
<function name="enum_par"
   class="io.mycat.route.function.PartitionByFileMap">
   <property name="mapFile">partition-hash-int.txt</property>
</function>
<function name="hash_function" class="io.mycat.route.function.PartitionByLong">
   <property name="partitionCount">2</property>
   <property name="partitionLength">512</property>
</function>

最终会得出这样一个映射关系，识别分组会根据数据分布结点和function的唯一性将表分为几组，同一组的才会有ER关系。

table名	拆分列	数据分布结点	function	识别分组
tableA	id_a	dn1,dn2	hash_function	1
tableB	id_b	dn1,dn2	hash_function	1
tableC	id_c	dn2,dn1	hash_function	2
tableD	id_a	dn3,dn4	hash_function	3
tableE	id_a	dn1,dn2	hash_function	1
tableF	id_a	dn1,dn2	enum_par	4

即，ER关系集合为：

<tableA.id_a , tableB.id_b, tableE.id_a >

<tableC.id_c>

<tableD.id_a>

<tableF.id_a>

PS：此处略去了schema，实际实现需要标识schema防止重复，ER关系是到列的，如果关联关系不是上述表对应的列，也不会视为ER.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2.14_ER_Split.md

2.14_ER_Split.md

2.14 ER 拆分

2.14.1 普通ER拆分

2.14.2 智能的ER关系

Files

2.14_ER_Split.md

Latest commit

History

2.14_ER_Split.md

File metadata and controls

2.14 ER 拆分

2.14.1 普通ER拆分

2.14.2 智能的ER关系