segway on SGE does not submit job arguments longer than 1024 chars #72

Closed
EricR86 opened this issue Jun 6, 2016 · 7 comments
Labels
bug Something isn't working critical

Comments

@EricR86
Member

EricR86 commented Jun 6, 2016

Original report (BitBucket issue) by Rachel Chan (Bitbucket: rcwchan).


This is a DRMAA issue that affects segway. Upon calling

#!python

> /mnt/work1/users/home2/rachelc/segway/segway/run.py(1874)queue_gmtk()
(Pdb) l
1869             job_tmpl.jobName = job_name
1870             job_tmpl.remoteCommand = ENV_CMD
1871 ->         job_tmpl.args = map(str, args)

__set__ from drmaa/helpers.py is called, and does the following:

#!python

181          def __set__(self, instance, value):
182              c(drmaa_set_vector_attribute, instance,
183  ->              self.name, string_vector(value))

string_vector(value) does not truncate the arguments passed to it. It is drmaa_set_vector_attribute that appears to have a buffer overflow of some sort, and it caps every argument at 1024 characters.

This seems to be a pre-existing issue that minibatch only made easier to isolate, presumably because the list of windows is random under minibatch. Without minibatch, a truncated argument was likely to end on a partial window number that was itself a valid window; with minibatch, that is unlikely. For example, suppose the window number '1000' is cut down to '10'. Without minibatch, 10 would likely be a valid window, so ending the argument there would throw no error. With minibatch, 10 is unlikely to be a valid window because the windows are chosen at random.
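The effect described above can be reproduced without DRMAA at all. The following is a minimal sketch, not segway code: the helper name `last_window_after_cap` and both window lists are made up for illustration, and the minibatch is stood in for by a fixed stride rather than a random sample so that the result is repeatable.

```python
LIMIT = 1024  # per-argument cap observed in this report


def last_window_after_cap(windows, limit=LIMIT):
    """Join windows with commas, keep `limit` characters, return the last parsed number."""
    arg = ",".join(str(w) for w in windows)
    fields = [field for field in arg[:limit].split(",") if field]
    return int(fields[-1])


# Without minibatch: every window is used, so a cut-off number can still be valid.
all_windows = list(range(400))
last_full = last_window_after_cap(all_windows)

# Stand-in for a minibatch: a sparse subset (fixed stride here, random in practice).
sparse_windows = list(range(990, 4000, 7))
last_sparse = last_window_after_cap(sparse_windows)

print(last_full, last_full in all_windows)          # 28 True  (283 cut down to 28)
print(last_sparse, last_sparse in sparse_windows)   # 2 False  (2425 cut down to 2)
```

In the first case the damaged list still parses into valid windows, which is why the truncation could go unnoticed; in the second case the stray window number does not belong to the subset, so an error surfaces.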

The bug can be easily reproduced, and inspected with pdb, using the following simple test:

#!python

import drmaa

def main():
    session = drmaa.Session()
    session.initialize()
    job_template = session.createJobTemplate()
    args = ["", '5,7,9,14,17,22,31,32,36,37,38,48,50,55,60,63,65,68,70,72,78,79,82,85,86,88,89,90,91,92,95,96,99,100,105,106,111,114,118,119,120,123,126,127,128,131,132,135,136,138,145,148,150,151,154,155,156,164,168,172,174,182,183,184,187,192,193,195,197,198,199,202,203,207,208,212,216,218,220,222,225,226,230,233,239,242,245,251,257,258,259,260,261,265,266,268,279,280,283,284,286,287,288,289,294,298,301,303,304,306,314,318,325,329,330,332,334,337,344,346,352,364,366,369,370,373,379,381,382,384,390,391,392,394,402,403,407,408,414,419,425,431,433,434,439,440,444,445,446,459,460,461,466,468,469,470,473,476,480,486,487,494,495,497,501,515,516,517,522,528,529,533,535,536,538,539,543,546,547,550,551,552,553,562,565,568,569,570,572,575,578,580,582,583,587,588,590,592,598,601,603,606,607,608,613,620,623,626,627,628,629,630,631,634,639,647,650,654,655,658,669,672,673,675,676,681,683,694,697,698,703,706,709,710,712,716,719,725,733,736,741,743,745,748,752,754,756,757,761,765,766,770,771,772,773,774,776,778,779,780,786,788,792,793,796,803,805,808,809,811,812,813,814,817,820,823,829,831,834,840,842,845,850,851,854,856,857,858,864,867,870,871,875,876,878,879,881,885,890,894,895,897,901,902,903,904,905,912,915,921,924,925,927,932,934,936,937,939,941,942,948,951,952,954,955,960,961,965,971,973,977,978,979,980,984,985,995,996,998,1000,1001,1004,1007,1009,1011,1012,1015,1016,1017,1018,1021,1025,1026,1027,1031,1035,1040,1046,1047,1048,1051,1052,1053,1056,1057,1058,1061,1065,1066,1071,1072,1085,1087,1088,1089,1092,1093,1094,1095,1098,1102,1103,1106,1108,1111,1112,1117,1118,1123,1127,1133,1135,1136,1143,1146,1147,1151,1153,1154,1156,1160,1161,1163,1164,1165,1172,1176,1179,1180,1181,1183,1184,1185,1188,1189,1190,1191,1192,1193,1194,1200,1201,1204,1206,1207,1209,1210,1211,1212,1216,1219,1222,1226,1237,1240,1243,1244,1245,1248,1250,1253,1255,1256,1257,1261,1268,1269,1276,1281,1283,1286,1290,1292,1295,1296,1299,1301,1302,1303,1307,1310,1321,1322,1329,1332,1335,1337,1342,1343,1344,1348,1350,1353,1354,1357,1358,1359,1360,1372,1373,1376,1377,1384,1385,1390,1391,1393,1394,1398,1399,1401,1408,1409,1413,1414,1417,1419,1425,1431,1435,1436,1439,1440,1443,1444,1446,1447,1451,1452,1459,1460,1461,1465,1466,1469,1470,1471,1475,1476,1477,1479,1481,1485,1486,1487,1489,1491,1492,1495,1499,1500,1503,1505,1508,1513,1516,1528,1531,1532,1538,1547,1549,1550,1552,1554,1556,1561,1564,1566,1569,1573,1584,1585,1586,1590,1591,1593,1595,1596,1597,1599,1601,1606,1610,1611,1613,1614,1617,1621,1623,1626,1628,1630,1639,1640,1644,1647,1648,1651,1652,1655,1659,1660,1664,1668,1674,1679,1681,1684,1691']
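    # A list (rather than a lazy map object) behaves the same on Python 2 and 3.
    job_template.args = [str(arg) for arg in args]
    # Reading the attribute back shows the long argument cut off at 1024 characters.
    print(job_template.args)
    session.deleteJobTemplate(job_template)
    session.exit()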

if __name__=='__main__':
    main()

This clearly shows that job_template.args is truncated by drmaa_set_vector_attribute. It is also possible, though speculative, that the unicode memory issue in issue 60 (#60) is caused by this buffer overflow as well.
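Rather than comparing the printed list by eye, the values that were assigned can be compared with the values read back. The helper below is a hypothetical sketch (`find_truncated` is not part of drmaa or segway), and the round trip is simulated by slicing at 1024 characters; with a real session, `received` would be the value read from job_template.args.

```python
def find_truncated(sent, received):
    """Return (index, sent_length, received_length) for each argument that came back shorter."""
    return [
        (index, len(before), len(after))
        for index, (before, after) in enumerate(zip(sent, received))
        if len(after) < len(before)
    ]


# Simulated round trip: every argument is cut at 1024 characters,
# which is the behaviour this report describes.
sent = ["", "7," * 1000]
received = [arg[:1024] for arg in sent]

print(find_truncated(sent, received))  # [(1, 2000, 1024)]
```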

@EricR86
Member Author

EricR86 commented Jun 7, 2016

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


  • set assignee_account_id to "557058:e439e22e-8cfc-4cf1-b090-030d33a0730e"
  • set assignee to "rcwchan (Bitbucket: rcwchan)"

@EricR86
Member Author

EricR86 commented Jun 7, 2016

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


It is worth noting that the c(drmaa_set_vector_attribute, ...) call seems to turn the Python attribute into unicode from within the C library itself, not in the drmaa Python code.

@EricR86
Member Author

EricR86 commented Jun 7, 2016

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


Arguments in bash also do not seem to have this limit. You can, for example, write a script that echoes its first argument and pass in an argument longer than 1024 characters.
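A comparable check can be run from Python instead of a bash script. This is a sketch under the assumption that launching a child process directly is a fair stand-in for the shell test described above; the child simply reports the length of the first argument it receives.

```python
import subprocess
import sys

# One argument well over 1024 characters (3889 characters in total).
long_arg = ",".join(str(n) for n in range(1000))

# The child process prints the length of its first command-line argument.
child_code = "import sys; print(len(sys.argv[1]))"
output = subprocess.check_output([sys.executable, "-c", child_code, long_arg])
received_length = int(output.decode().strip())

print(len(long_arg), received_length)
```

If both numbers match, the operating system passed the argument through intact, which points back at the DRMAA layer rather than at process arguments in general.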

@EricR86
Member Author

EricR86 commented Jun 7, 2016

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


This also does not seem to be a limit of qsub itself in SGE. I can submit jobs with arguments longer than 1024 characters and get the expected output (e.g. by echoing the first argument).

@EricR86
Member Author

EricR86 commented Jun 7, 2016

Original comment by Rachel Chan (Bitbucket: rcwchan).


DRMAA_python issue submitted here.

@EricR86
Member Author

EricR86 commented Jun 7, 2016

Original comment by Rachel Chan (Bitbucket: rcwchan).


This bug also occurs on h4h (PBS Torque system). I've emailed the issue to the support email for pbs-drmaa.

@EricR86
Member Author

EricR86 commented Jul 26, 2016

Original comment by Eric Roberts (Bitbucket: ericr86, GitHub: ericr86).


  • changed state from "new" to "resolved"

Resolved in Pull Request #55

The DRMAA issues themselves are still outstanding, but this is a suitable workaround that should avoid this corner case entirely.
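The pull request is not reproduced in this thread, so the following is purely illustrative and not necessarily what it does. One generic way to stay under a per-argument cap is to write any overlong argument to a file and pass a short reference to that file instead. The function names and the `@path` convention are hypothetical, and a real argument that already starts with `@` would need separate handling.

```python
import os
import tempfile

LIMIT = 1024  # per-argument cap observed in this report


def spill_long_args(args, directory, limit=LIMIT):
    """Replace each argument of `limit` characters or more with '@<path>' to a file holding it."""
    result = []
    for arg in args:
        if len(arg) < limit:
            result.append(arg)
            continue
        fd, path = tempfile.mkstemp(dir=directory, suffix=".arg", text=True)
        with os.fdopen(fd, "w") as handle:
            handle.write(arg)
        result.append("@" + path)
    return result


def load_spilled_args(args):
    """Inverse of spill_long_args, run on the receiving side of the job."""
    restored = []
    for arg in args:
        if arg.startswith("@"):
            with open(arg[1:]) as handle:
                restored.append(handle.read())
        else:
            restored.append(arg)
    return restored


with tempfile.TemporaryDirectory() as directory:
    args = ["run", "9," * 1000]
    spilled = spill_long_args(args, directory)
    restored = load_spilled_args(spilled)

print([len(arg) for arg in args], restored == args)
```

On a cluster the directory would have to be on a filesystem that both the submitting host and the execution host can see.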

@EricR86 EricR86 closed this as completed Jul 26, 2016
@EricR86 EricR86 added critical bug Something isn't working labels Apr 6, 2020