SGE Plugin hardcodes to TACC parallel environments #87

Open
drelu opened this issue Feb 19, 2013 · 9 comments

Comments

@drelu
Member

drelu commented Feb 19, 2013

Make the parallel environment for SGE configurable so that machines other than Lonestar can use the SGE plugin. On Morar, for example, the following parallel environments are supported:

make
mpi
mpi-128
omp
omp-128
smp

@oleweidner
Contributor

Hi Andre,

Can you please elaborate on how these environments are configured if you use them via plain SGE (i.e., without Bliss)? Is that something that goes into the job script? Are you suggesting that 'mpi, mpi-128, omp,…' should be options for the saga 'jd.spmd_variation' field?

I'm working on the SGE adaptor at the moment, so the timing is perfect ;-)

Thanks,
Ole


@drelu
Member Author

drelu commented Feb 20, 2013

Hi Ole,
the available pe can be queried via:

qconf -spl
10way
11way
12way
1way
24way
2way
4way
6way
8way

Since the name is just a string, it should be provided by the user.
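
For reference, a minimal Python sketch of how an adaptor could discover the names. It assumes `qconf` is on the PATH of the submit host and that `qconf -spl` prints one name per line, as in the output above; the function names are illustrative and not part of Bliss.

```python
import subprocess


def parse_pe_list(qconf_output):
    """Turn the text printed by `qconf -spl` into a list of PE names."""
    return [line.strip() for line in qconf_output.splitlines() if line.strip()]


def list_parallel_environments():
    """Ask SGE for its configured parallel environments (needs a working qconf)."""
    result = subprocess.run(
        ["qconf", "-spl"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError if qconf fails
    )
    return parse_pe_list(result.stdout)
```

Keeping the parsing separate from the `qconf` call means the parser can be exercised on a machine without SGE installed.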

Thanks!

Best,
Andre


@oleweidner
Contributor

Hi Andre,

how's 'pe' related to the previously mentioned 'mpi, mpi-128, omp,…'?

How do you think this should be mapped to saga?

Thanks,
Ole


@drelu
Member Author

drelu commented Feb 21, 2013

Hi,
there is no right way to map this to SAGA. AndreM will hit me for this, but
I think an SGE-specific extension attribute is the right place for this way
of specifying a parallel environment.
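
A rough sketch of what that could look like on the adaptor side, assuming the job script is generated as text and the parallel environment is requested with the standard `-pe <name> <slots>` option. The attribute name `sge_parallel_environment` and the helper `sge_pe_directive` are invented for illustration; neither exists in SAGA or Bliss.

```python
def sge_pe_directive(pe_name, slots):
    """Return the job-script line that requests `slots` slots in `pe_name`."""
    if not pe_name or any(ch.isspace() for ch in pe_name):
        raise ValueError("parallel environment name must be a non-empty token")
    if int(slots) < 1:
        raise ValueError("slot count must be at least 1")
    return "#$ -pe {} {}".format(pe_name, int(slots))


# Hypothetical job description: 'sge_parallel_environment' is an invented
# extension attribute, not an existing SAGA or Bliss attribute.
job_description = {
    "executable": "/bin/date",
    "number_of_processes": 16,
    "sge_parallel_environment": "mpi",
}

print(sge_pe_directive(job_description["sge_parallel_environment"],
                       job_description["number_of_processes"]))
# prints: #$ -pe mpi 16
```

The PE name is passed through untouched, since the set of valid names differs from machine to machine.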

Best,
Andre


@andre-merzky
Member

Hi Andre,

On Thu, Feb 21, 2013 at 1:34 AM, Andre Luckow [email protected] wrote:

> Hi,
> there is no right way to map this to SAGA. AndreM will hit me for this,

Hmm, tempting... ;-)

> but I think an SGE specific extension attribute is the right place for this
> way of specifying a parallel environment.

Well, if it is needed, it's needed... -- but is there a way to simply
encode this in an existing attribute, like

NUMBER_OF_PROCESSES = "24@4way"
SPMD_VARIATION = "MPI+OMP-128"

I don't really yet understand what those attributes are supposed to
specify, so the above examples are probably stupid -- but I think you
see what I am asking?
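
To make the trade-off concrete, here is a rough sketch of what decoding such a combined value would involve. The `<count>@<pe>` format is only the illustrative one from the example above, and `split_process_spec` is a hypothetical helper, not Bliss or SAGA API.

```python
def split_process_spec(value):
    """Split '24@4way' into (24, '4way'); a plain '24' gives (24, None)."""
    text = str(value).strip()
    count, sep, pe_name = text.partition("@")
    if not count.isdigit():
        raise ValueError("process count must be an integer: %r" % (value,))
    if sep and not pe_name:
        raise ValueError("missing parallel environment after '@': %r" % (value,))
    return int(count), (pe_name or None)
```

Every adaptor that does not understand the suffix would still have to strip or reject it, which is the cost of putting backend details into a numeric attribute.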

Best, Andre.

There are only two hard things in Computer Science: cache invalidation and
naming things.

-- Phil Karlton

@drelu
Member Author

drelu commented Feb 21, 2013

>> I think an SGE specific extension attribute is the right place for this
>> way of specifying a parallel environment.
>
> Well, if it is needed, it's needed... -- but is there a way to simply
> encode this in an existing attribute, like
>
> NUMBER_OF_PROCESSES = "24@4way"
> SPMD_VARIATION = "MPI+OMP-128"
>
> I don't really yet understand what those attributes are supposed to
> specify, so the above examples are probably stupid -- but I think you
> see what I am asking?

Introducing an awkward way of overloading existing attributes just to avoid
touching the SAGA spec (which is very inflexible and does not allow any
extension!!). Job description attributes should be extensible!

The focus should be on the user and how to make it simple and
straightforward for them! It is difficult enough to mentally map
resource-specific commands to SAGA (see Melissa's question with respect to ppn!).
Hell does not freeze over just because of a resource-specific
attribute. Encoding resource specifics into a string like
number_of_processes (which is usually a number) is just a big hack and
certainly not something SAGA envisioned when providing a unified
abstraction.

Best,
Andre

@andre-merzky
Member

On Thu, Feb 21, 2013 at 1:52 AM, Andre Luckow [email protected]:

> Introducing an awkward way of overloading existing attributes just to avoid
> touching the SAGA spec (which is very inflexible and does not allow any
> extension!!). Job description attributes should be extensible!

Ah, please read again -- I was asking if there exists a simple way to
encode that information... The reason this time is not so much the spec, but
that things are sufficiently confusing with the number of
processor-assignment attributes we already have -- adding yet another one
which will only be usable for one specific backend won't make usage any
simpler...

> The focus should be on the user and how to make it simple and
> straightforward for them! It is difficult enough to mentally map
> resource-specific commands to SAGA (see Melissa's question with respect to ppn!).
> Hell does not freeze over just because of a resource-specific
> attribute. Encoding resource specifics into a string like
> number_of_processes (which is usually a number) is just a big hack and
> certainly not something SAGA envisioned when providing a unified
> abstraction.

Yes, the goal is simple usage. Funny that we have the same intentions, and
yet disagree so much on the means, isn't it :-)

Thanks, Andre.

@drelu
Member Author

drelu commented Feb 21, 2013

Hi Andre,

>> Introducing an awkward way of overloading existing attributes just to avoid
>> touching the SAGA spec (which is very inflexible and does not allow any
>> extension!!). Job description attributes should be extensible!
>
> Ah, please read again -- I was asking if there exists a simple way to
> encode that information... The reason this time is not so much the spec, but
> that things are sufficiently confusing with the number of
> processor-assignment attributes we already have -- adding yet another one
> which will only be usable for one specific backend won't make usage any
> simpler...

No, there is no simple way. The admin can define any arbitrary string: on
one machine a parallel environment might be called "mpi", on another
"16way".

Best,
Andre

@npch

npch commented Mar 5, 2013

Given that there is a bigger issue at hand here ("how do you keep things simple when an admin can define any set of arbitrary strings for parallel environments?"), what interim patch could be implemented in a fork of the Bliss codebase in the meantime?
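
One possible stopgap, sketched below under loud assumptions: let the user pass the PE name explicitly, fall back to an environment variable, and only then to whatever value the fork currently hardcodes. The variable name `BLISS_SGE_PE`, the function name, and the argument layout are all hypothetical; none of them exist in Bliss.

```python
import os


def resolve_parallel_environment(requested=None, available=None,
                                 default=None, environ=None):
    """Pick a PE name: explicit request, then BLISS_SGE_PE (hypothetical), then default."""
    environ = os.environ if environ is None else environ
    name = requested or environ.get("BLISS_SGE_PE") or default
    if not name:
        raise ValueError("no SGE parallel environment specified")
    # Optional sanity check against the names reported by `qconf -spl`.
    if available is not None and name not in available:
        raise ValueError(
            "unknown parallel environment %r; available: %s"
            % (name, ", ".join(sorted(available))))
    return name
```

For example, `resolve_parallel_environment("mpi", available=["mpi", "smp"])` returns `"mpi"`, while an unknown name raises a `ValueError` that lists what the machine actually offers.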
