This repository was archived by the owner on Jan 3, 2018. It is now read-only.

WIP: IPython parallel intermediate lesson#438

Closed
cfriedline wants to merge 12 commits into swcarpentry:master from cfriedline:swc-ipython-parallel-lesson

Conversation

@cfriedline

WIP: This is a PR that I would like to include for intermediate users as an introduction to using the IPython parallel framework. I've tried to keep it under control to fulfill the 10-minute lesson idea, but it could easily be expanded. Looking forward to comments and critique.

Updated: Due to time constraints of Evolution 2014, I'm going to be slow on updates until the end of June, but will try to make intermediate changes as I can.

Contributor

"tissue" instead of "tissues"?

@selimnairb
Contributor

This lesson is very discipline-specific. I suggest either: (1) adding some background/links to background about the analysis; or better still (2) providing a more abstract example.

Also, having never used IPython's multiprocessing, I would like to have a bit more explanation of the essential concepts necessary for understanding what's going on. Links to relevant IPython documentation would also be helpful.

@cfriedline
Author

Thanks for your comments, Brian. I will think about adding some more background. Perhaps the example is discipline-specific, but the ability to process work in parallel is definitely not. There's lots more about IPython parallel that can be added, but I just didn't think it was appropriate for the 10-minute lesson we were assigned. I plan on adding more in the future, especially with regard to computing across physical machines, which has a much lower barrier to entry with IPython parallel than with something like Celery (IMO). I'll work on some of the text later today and tomorrow and push up some changes.

@ChristinaLK

I love the idea - I didn't realize you could do parallel processing with ipython, so this is great. I will definitely come back and play around with these ideas.

Two things that would make the lesson clearer for me:

  • Include a very simple definition/diagram of parallel processing (i.e. the difference between running a piece of code on one file at a time vs. running it on multiple files simultaneously).
  • It's not clear upfront what you're trying to parallelize. I went through most of the lesson wondering if you were planning to compare files. I would either choose a much simpler example or bring the python code that you're going to run on the files up towards the beginning of the lesson and then explain, this is what I want to do in parallel.
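
As a rough illustration of the first point, the sketch below contrasts handling one file at a time with handling several at once. It is not taken from the lesson: it assumes the standard library's `concurrent.futures` rather than IPython.parallel so that it runs without a cluster, and `FILES`, `count_words`, `run_serial`, and `run_parallel` are hypothetical names invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for files on disk: name -> contents.
FILES = {
    "a.txt": "one two three",
    "b.txt": "four five",
    "c.txt": "six",
}


def count_words(name):
    """Return (name, word count) for one in-memory 'file'."""
    return name, len(FILES[name].split())


def run_serial(names):
    # One file at a time: each call finishes before the next one starts.
    return [count_words(name) for name in names]


def run_parallel(names, workers=3):
    # Several files in flight at once: the pool hands each name to a free
    # worker. Executor.map() still yields results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_words, names))


if __name__ == "__main__":
    names = sorted(FILES)
    print(run_serial(names))
    print(run_parallel(names))
```

Both functions return the same results; only the scheduling differs. Threads are used here just to keep the sketch portable. For CPU-bound work in Python, separate processes (or IPython.parallel engines, as in the lesson) would be the more likely choice.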

@cfriedline
Author

Thanks, @ChristinaLK! IPython.parallel is so powerful, and I hope I can make the lesson more clear so it's accessible to everyone. When I get back from field work (5/18), I'm going to give the lesson a good overhaul with everyone's comments from above as well as from teaching the lesson with @gvwilson last week.

Contributor

Symmetric multiprocessing is not the same thing as "Shared memory multiprocessing", which is closer to what you're describing here.

Contributor

You're missing some forms of acceleration from manycore processors, you may want to consider discussing PyOpenCL or theano here.

Contributor

You're reassigning the line variable here, you might want to consider creating a new variable composed of the list of integers. You're also using a list comprehension, which isn't always covered in a Python tutorial. Not necessarily a bad thing, but something to be aware of.
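
A minimal sketch of this suggestion, assuming the code under review parses a whitespace-separated line of integers. The reviewed code itself is not shown in this thread, so `line`, `counts`, `counts_loop`, and the sample data are hypothetical.

```python
# Hypothetical input: one line of whitespace-separated integers.
line = "3 14 15 92"

# Pattern the comment describes: `line` starts as a string and is then
# rebound to a list of integers, so its meaning changes mid-script.
#     line = [int(field) for field in line.split()]

# Suggested alternative: leave `line` as the raw text and give the parsed
# values a name of their own.
counts = [int(field) for field in line.split()]

# Equivalent explicit loop, for learners who have not met list
# comprehensions yet.
counts_loop = []
for field in line.split():
    counts_loop.append(int(field))

print(counts)
```

Keeping the raw text and the parsed values under separate names means either can still be inspected later, which may matter when debugging in a notebook.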

@ahmadia
Contributor

ahmadia commented May 14, 2014

Thanks for the PR @cfriedline, this is a well-designed small introduction to some of the parallel capabilities of the IPython Notebook.

This seems to have a lot in common with #417, particularly some of the introductory material. I think the redundancies between the two lessons should be resolved, and they should perhaps be integrated into a common parallel module, with room for introducing tools like mpi4py and similar.

As it stands, this lesson seems to be a little too domain-specific to be of general use in a Software Carpentry workshop, but it sounds like there has been considerable demand for parallel computing lesson materials. I think this could use more polishing before it's merged in.

@cfriedline
Author

Thanks for your comments, @ahmadia. I agree that there are some redundancies and a need for expansion, but please keep in mind that this lesson was designed for the instructor course and does 1) assume some proficiency with Python and the shell and 2) fulfill a requirement for a 10-minute lesson. When I get back from the field, I plan on updating it to be a more complete treatment of parallel computing with IPython. The example was meant to be more substantial than your typical "fibonacci" hello-world type thing, reflecting something that biologists dealing with next-generation sequencing might encounter. However, I don't think it's completely redundant with #417 for several reasons, the most significant being that once the concepts are learned, they can easily be scaled across multiple physical boxes (cluster engines distributed via SSH, SGE, or MPI), something that's not necessarily possible with multiprocessing.

I'd agree that there might be a need for a separate parallel module, rather than rolling up into just "intermediate." I'd be happy to contribute anything IPython.parallel there if that's what people want to do.

Look for more commits next week. Thanks.

@ahmadia
Contributor

ahmadia commented May 14, 2014

@cfriedline - if this went into some sort of directory that indicated it wasn't part of the standard lesson plan, I'd be +1 for merging it in (modulo a little bit of refactoring/integration with the multiprocessing lesson). My primary caution is in pushing material into the overstuffed "intermediate" lesson materials plan.

@gvwilson gvwilson changed the title IPython parallel intermediate lesson (from teaching assignment 8.5) IPython parallel intermediate lesson May 21, 2014
@cfriedline cfriedline changed the title IPython parallel intermediate lesson WIP: IPython parallel intermediate lesson Jun 9, 2014
@ethanwhite
Contributor

On May 14, 2014 11:20 AM, "Aron Ahmadia" notifications@github.com wrote:

@cfriedline - if this went into some sort of directory that indicated it
wasn't part of the standard lesson plan, I'd be +1 for merging it in
(modulo a little bit of refactoring/integration with the multiprocessing
lesson). My primary caution is in pushing material into the overstuffed
"intermediate" lesson materials plan.

+1 for merging into a subdirectory.

@gvwilson
Contributor

I have submitted a PR to @cfriedline with updates to this - once that's merged, we'll merge this one.

@ethanwhite
Contributor

It looks like the PR mentioned by @gvwilson was merged in a couple of months ago, so I think this is ready to be merged.

@ahmadia
Contributor

ahmadia commented Sep 27, 2014

This is still heading into intermediate/python, which was my primary objection.

@ethanwhite
Contributor

@ahmadia I agree (as I did a few months ago). That said, with a potential repo split coming up, I'd say it's better to get it in as is than not at all. We can easily move it after it's merged.

@cfriedline Any chance you could tack on one more commit to move this into intermediate/python/extras/?

@ethanwhite
Contributor

I'd say we should go ahead and merge this and then move it after the merge.

@gvwilson
Contributor

gvwilson commented Oct 7, 2014

If @abostroem and/or @tbekolay agree, please go ahead and push the big green button.

@abostroem
Contributor

I'll need a few days to read over this (grad school just hit me).

@gvwilson
Contributor

gvwilson commented Feb 3, 2016

Pointer relocated to swcarpentry/python-intermediate#11

@gvwilson gvwilson closed this Feb 3, 2016
7 participants