Getting Started with Hardware and Software

From GEP Wiki

This page is a brief introduction to the software and hardware you will need to participate in research-based finishing.

Given the increasing availability of web tools and genomic browsers, it is possible to participate in the GEP at all levels of annotation using only web-based tools and good recordkeeping. This is not true for finishing. The suite of programs used in finishing, the phred/phrap/consed package, was initially written for UNIX-based workstations and can only be run on computers that support UNIX-based programs (UNIX workstations, Macs running OS X, or PCs running Linux). At this time there are no plans to port the software to Windows. To include finishing in your curriculum, your students will therefore need access to computers that can either run the phred/phrap/consed package or connect properly to other computers that can.
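The platform requirement above can be sketched as a quick check. This is only an illustration, not part of the GEP software: the function names are hypothetical, and "UNIX-based" is approximated by the two system names Python reports for Mac OS X ("Darwin") and Linux.

```python
import platform

# Assumption: "UNIX-based" is approximated by the names Python reports for
# Mac OS X ("Darwin") and Linux. Other UNIX variants would need to be added.
UNIX_LIKE_SYSTEMS = {"Darwin", "Linux"}


def can_run_locally(system_name=None):
    """Return True if the given (or current) OS is in the UNIX-like set above."""
    name = system_name if system_name is not None else platform.system()
    return name in UNIX_LIKE_SYSTEMS


def suggest_approach(system_name=None):
    """Map an OS name to the broad kind of solution discussed on this page."""
    if can_run_locally(system_name):
        return "install phred/phrap/consed directly on this computer"
    return "connect to a UNIX/Linux computer or run a Linux virtual machine"


if __name__ == "__main__":
    # Reports the suggestion for whatever computer runs this script.
    print(platform.system(), "->", suggest_approach())
```

Running the script on a student computer prints which of the two broad approaches applies to it; the detailed options for each case are described in the sections below.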

The exact hardware configuration that will work best for any given college will depend mostly on the type and availability of existing computers, as well as the willingness and ability of your local computer technical support to implement and maintain any given solution.

The solutions below are divided first by the type of computer that students will have direct access to while undertaking a finishing project. Within each group the list has been sorted, as far as possible, by increasing complexity and requirements for local support.

In general, solutions in which students work on Apple Macintosh computers are preferable to PC-based solutions. There are several reasons for this:

  • The phred/phrap/consed package of software can be installed directly on any Macintosh computer running any version of OS X. This avoids setting up and maintaining complicated server/client configurations (see below).
  • Users at Washington University in St. Louis have been using Macintosh computers to teach this curriculum since 2003, so they have seen and solved many problems in getting the phred/phrap/consed package to work. This experience makes it likely, though it does not guarantee, that any problems you encounter can be solved in a timely fashion.

Mac Based Solutions

Loaner computers

This is the most expensive option but greatly simplifies administration. In this case each student is lent a portable Mac that has been configured with all the finishing software. Students are required to sign out the computer and return the computer at the end of the semester in good working condition prior to release of a grade in the class.

  • Pros: Software is very easy to install and maintain. The approach is simple and straightforward enough that local computer support, while recommended, is not required; there is sufficient expertise and willingness to help among the members of the GEP. Productivity is increased because students can work on projects outside of class time. Computers can support other classes during summer and off semesters. Computers are very easy to set up and maintain using the cloning techniques available for the Mac. Avoids more complicated server/client solutions.
  • Cons: Expensive. Without local technical support you will need to develop your own computer skills. Damage to computers while in student hands may lead to awkward or difficult situations.
  • Current users: Washington University in St. Louis, contact: Dr. Chris Shaffer at

Student Lab with Macs running consed

This is probably the most common scenario for colleges using Macs. The level of success you enjoy with this solution will depend primarily on the expertise of your local computer support and on how the computer lab is organized. Most computer labs are set up so that students cannot make permanent changes to the computers. Since students doing finishing need to save their work and maintain access to it over the course of several weeks, computers that automatically delete all student work cannot be used for finishing as they stand. Some method will need to be devised to keep student-generated finishing data accessible from one lab session to the next. Exactly how this is accomplished will vary with each setup.

  • Pros: No new hardware required.
  • Cons: Difficult to impossible to set up persistent data. Long labs may monopolize valuable computer lab time.
  • Current users:
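One possible way to approach the persistence problem described above is to copy each student's working directory to storage that survives between lab sessions and to copy it back at the start of the next one. The sketch below is a minimal illustration under assumed conditions: the function names and both directory locations are hypothetical, and the persistent location could be a network share, a USB drive or any other storage your lab permits. It requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def save_session(work_dir, persistent_dir):
    """Copy the student's working directory to persistent storage.

    Files already present in persistent_dir are overwritten by newer copies;
    files deleted from work_dir are NOT removed from persistent_dir.
    """
    shutil.copytree(work_dir, persistent_dir, dirs_exist_ok=True)


def restore_session(persistent_dir, work_dir):
    """Copy saved work back onto the lab computer.

    Returns False when no saved work exists yet (e.g. the first lab session).
    """
    if not Path(persistent_dir).is_dir():
        return False
    shutil.copytree(persistent_dir, work_dir, dirs_exist_ok=True)
    return True
```

A student would call restore_session at the start of a lab and save_session at the end, each time with their own pair of directories. Whether this is workable depends entirely on what storage your computer lab makes available.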

PC Based Solutions

PC using the liveCD

It is possible to run a virtual computer running Linux on a PC to get access to the finishing software. This "liveCD" system uses the free software VirtualBox and a downloadable image from the GEP servers. The system is currently being evaluated and tested for widespread use. A few GEP members tried it during the spring '09 finishing and reported satisfactory results, with a few caveats. You can learn more about this solution by visiting the "Tools Under Development" menu option from the main GEP web page. If you are interested in participating in the test, please contact Wilson for instructions on obtaining the images and installing the necessary software.

  • Pros: No new hardware required. Required software freely available. 
  • Cons: The more modern the PC the better. Memory intensive; 1 GB of installed memory is recommended. Still under development. You will need the flexibility to handle unexpected errors.

PC Lab running Linux

Some schools have PC-based labs in the Engineering or Computer Science departments in which the computers run one of the various Linux distributions instead of Windows. Given enough technical support and cooperation, it is fairly simple to install and run the appropriate finishing software on these computers. Usually these computers are not set up to delete all student work; instead each student is given an account and some allocation of disk space.

  • Pros: No new hardware required. Students are given accounts, so persistence of data is usually not a problem.
  • Cons: High level of local tech support required. The large number of different Linux distributions makes support from the GEP problematic. Long labs may monopolize valuable computer lab time.
  • Current users:

PC Lab running Windows to connect to Workstation

While this is perhaps the most technically complicated solution to install and maintain, it will probably be the choice of many locations, given that many biology departments only have access to PCs running Windows. In these cases it is possible to install all the finishing software on one or a few compatible workstations and have students use the PCs to connect to the workstation over a network in order to run the appropriate software. The workstations can be located anywhere on campus but must be running one of the compatible versions of Unix or Linux. The owner of the workstation must also be willing to give students accounts on these computers so they can log on and run the finishing software. The PCs in the lab will need special software installed to allow them to communicate effectively with the workstations. The cost and effectiveness of this solution will vary widely depending on the exact nature of the local setup and should be discussed with local tech support and those responsible for the workstations.

N.B. It is not strictly necessary to use PCs to connect to the server. If desired, one could use Macs instead; this might appeal to someone who has a Mac lab but does not wish to, or does not have permission to, install software on the Macs. Almost all of the considerations here apply whether Macs or PCs are used as the client.

  • Pros: Little or no new hardware required. Student data can be saved on the workstation, while PCs in labs can erase all student data. Minimal added hardware cost if none of the options above is available.
  • Cons: High level of local tech support required, both for running the workstation and for communication between the PCs and the workstation. One workstation can handle only so many students before it is overwhelmed and too slow for viable instruction, so large numbers of students may require more than one workstation.
  • Current users: Jason Caronna has written up a nice Word document describing how they set up their server/client system at Montclair State University with Charles Du. You can download the file Consed_server.doc.
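The capacity point in the Cons above comes down to simple arithmetic: divide the class size by the number of students one workstation can serve and round up. The sketch below illustrates this. The per-workstation capacity is not a figure from this page; it is a hypothetical parameter that has to be measured on your own hardware.

```python
def workstations_needed(num_students, students_per_workstation):
    """Return the number of workstations needed, rounding up.

    students_per_workstation is an assumption you must determine locally,
    e.g. by testing how many simultaneous users keep consed responsive.
    """
    if num_students < 0:
        raise ValueError("num_students must not be negative")
    if students_per_workstation <= 0:
        raise ValueError("students_per_workstation must be positive")
    # Ceiling division using integers only: -(-a // b) rounds a / b upward.
    return -(-num_students // students_per_workstation)
```

For example, with a purely illustrative capacity of 20 students per workstation, a class of 45 students would need workstations_needed(45, 20), which is 3 workstations.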

Summary Table

Summary of the five basic solutions. Expertise is the level of local expertise required of at least one person, either the course instructor or another tech support person.

                                        Expertise in:
Type                       Cost         Unix/Linux   Mac OS X   Windows   Notes
Loaner Mac                 $$$$$        Low          Low        None      Best, and of course the most expensive
Mac Lab                    $$$          Medium       Medium     None      Your Mac support person will need Unix skills
LiveCD                     $            Low          None       Some      Under development; no guarantees
Linux Lab                  $            Very High    None       None      Very skilled local support needed, but cheap and fast
Windows<->Workstation [1]  $$ - $$$$    Low - High   None       Medium    Students can get frustrated if workstation is too slow

[1] Variability is due to wide variation in workstation cost and complexity.