Some jobs need to create multiple files whose names are derived from the
data. Something like this:
... -> transform [$.filepart, $.data]
-> write( dataDrivenFd(baseFd('/foo/*.dat')) )
where baseFd produces a FileOutputFormat, and dataDrivenFd replaces the '*'
with $[0] and writes the data from $[1] of each element. There are some tricky
issues: the number of open files, the same file being written on different
nodes (map/reduce output would require ./part-##### files), the need for a
special OutputCommitter, etc. We should special-case input that is already
partitioned and grouped by filepart, so that we open only one file at a time
and don't require the part-##### files.
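
To illustrate the grouped case, here is a minimal sketch in Python (not Jaql, and not a proposed implementation). It assumes the input is a sequence of (filepart, data) pairs already grouped by filepart; the function name `write_grouped` and the line-per-record output are hypothetical choices made only for this example.

```python
from itertools import groupby
from operator import itemgetter


def write_grouped(records, base_pattern):
    """Write (filepart, data) pairs to files named after the filepart.

    The '*' in base_pattern is replaced by the filepart. Records are
    assumed to arrive grouped by filepart, so each file is opened once
    and only one file is open at any time.
    """
    seen = set()
    paths = []
    for filepart, group in groupby(records, key=itemgetter(0)):
        if filepart in seen:
            # The same filepart showed up in two separate runs, so the
            # input was not grouped; reopening would clobber the file.
            raise ValueError(f"input not grouped by filepart: {filepart!r}")
        seen.add(filepart)
        path = base_pattern.replace("*", str(filepart), 1)
        with open(path, "w") as out:
            for _, data in group:
                out.write(f"{data}\n")
        paths.append(path)
    return paths
```

Because the grouping guarantees that each filepart appears in exactly one run, no part-##### suffix is needed to keep writers apart. Ungrouped input is rejected rather than silently overwritten.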
Original issue reported on code.google.com by [email protected] on 7 Jul 2010 at 12:51