Report of the Senate Information and Communications Technology Committee, 2012-2013

September 2013


This past year the IT Committee considered the following issues:

Online Education

See the reading list on online learning:

Sree Sreenivasan, then Chief Digital Officer and Professor of Professional Practice in Journalism, spoke with the Committee on September 7 about Columbia’s digital initiatives and again on February 15 at a joint meeting of the IT, Education and Libraries Committees. Prof. Sreenivasan said that he had three priorities in terms of online spaces:

He said that Columbia was talking to all these organizations about what the University should be doing. Prof. Sreenivasan said that while it was easy to get caught up in the MOOC conversation, he still thought there was something magical about the classroom experience with a gifted teacher. He felt that Columbia should experiment with multiple concepts. Members of the Committee asked how the current programs at Columbia for online learning in the Medical School, Business School, SEAS and Continuing Education were being integrated into the University’s plans.

It was noted that there is also a Senate task force on online learning and that the Senate’s inquiry into online learning had actually started ten years ago at Columbia with Fathom -- the Information Technology Committee had begun as the Fathom committee. Fathom ended up costing Columbia $30 million. Developing online courses is quite expensive now, with Coursera estimating the cost at $30,000, although some find it much more expensive.

At the February 2013 meeting, Prof. Sreenivasan and David Madigan, then chair of the Statistics Department and chair of the Provost’s Faculty Advisory Committee on Online Learning, updated the Committees on Columbia’s investigations of online learning. In particular, the University had decided to partner with Coursera to offer four initial MOOCs. At the time, Columbia was also hoping to do a pilot study with edX.

The speakers noted that online learning was extremely popular: the Khan Academy boasted 230 million views at the time. Columbia currently boasts a number of online learning programs. At SEAS, there are 14 master’s degrees that one can earn remotely. The School of Continuing Education has hybrid programs. Journalism is giving online courses. At P&S, all lectures are recorded and transcribed. Business and SIPA, the Center for Environmental Research, the French Department and many other entities at Columbia are already involved in online learning one way or another. They noted that there is considerable alumni interest in the College’s Core Curriculum. Others pointed out that there was evidence that students in “flipped” classrooms, where they listen to lectures online before class meetings and use class time for discussion, do better than those who spend class time listening to lectures.

One of the Columbia faculty members preparing MOOC classes with Coursera spoke about online learning issues from the faculty point of view. Michael Collins, Vikram S. Pandit Professor of Computer Science, was then preparing a course on natural language processing for presentation in the Spring 2013 term. (N.B.: This course ended up with an enrollment of 65,000.) He said that the first week of preparing the course was brutal, and even well into the effort it was still taking up all his time -- much more time than a regular course. Much of the effort was due to filming and editing.

Several issues were raised by student Senators about the advantages of online learning from the student perspective:

Concerns noted by the Committees included:

Shared Research Computing

David Madigan, head of the Shared Research Computing Policy Advisory Committee (SRCPAC), spoke with the IT Committee about this initiative to provide shared computing facilities to faculty on Columbia’s Morningside campus. This committee comprises faculty and CUIT and IT personnel. Its recommendations are passed on to senior University administrators.

Madigan’s committee met four or five times last year and formed six working groups:

As of October 2012, this was the state of the various groups:

The external group, led by Kathryn Johnston of the Astronomy Department, has examined what Columbia’s peers are doing. Princeton has built a building and has a policy on shared machines, but the rules are very loose. Jerry Ostriker, now in the Columbia Astronomy Department but a former provost at Princeton, is on the committee, and says it works well at Princeton. The Princeton policy is to encourage investigators to invest in bigger machines and share them. The university gives them some money and people to run the machines. Stanford does the same thing – it’s a common model. Columbia has historically been wary of this model and wants to proceed incrementally since it’s hard to predict what research will involve three to five years from now.

Some people think cloud computing will not be significant in shared computing: it can be expensive to move data as well as to access it. However, it’s unclear that the cost will remain prohibitive. This working group started a cloud pilot project with Senior Systems Engineer Rob Lane. Three years ago, Statistics and Astronomy bought a machine to share, called Hotfoot. It worked extremely well for a while. A year ago they expanded to include social sciences and the results have been mixed. The needs of the different groups are different. Google is interested in piloting with the group, with work shunted to the cloud. The pricing is attractive if we can get past privacy and other concerns. There are issues of accounting. This might be good for Morningside. But maybe in three to five years everything will be in the cloud.

SRCPAC is also looking at co-location. Putting machines on Morningside is expensive. Princeton is interested because they have spare capacity at locations in New Jersey. There is also NYSERNet, which has a facility in Syracuse. There is also a psychological barrier to co-location: people want to be in the same location as their computers. The idea is to institute “play nice” rules such as they have at Princeton and encourage people to put their machines there with the data center providing rackspace, networking, power and cooling. This approach buys us time by meeting current acute needs and allows us to experiment with rules.

The barrier to this data center approach is that it needs dedicated staff – 3 to 3.5 people to run the data center. The whole project rests on the possibility of building a team in CUIT.

Obstacles that SRCPAC had met as of October 2012 were:

Data Governance

The IT Committee followed up on its 2011-12 research on improving Columbia’s data governance, which was described in that year’s IT Committee report. Norberto Govin, a member of the Committee, reported on University plans to revise the Student Information System, which Mr. Govin thought should be done in consultation with experts in data management. Ron Forino, Director of Enterprise Business Intelligence Solutions in CUIT, provided some updates on the current plans, and Collibra, a Belgian company specializing in data governance, gave a presentation on its offerings, which might be helpful to Columbia. Collibra makes software that helps organizations set up rules and data definitions. The presenters outlined some of the functions of their system. (Their presentation is appended to this report.)

Mr. Forino also reported to the Committee on the current status of Data Governance initiatives, including FDS reporting project publications, tools for self-service reporting, the implementation of a data stewardship process and some data quality initiatives. (This presentation is appended to this report.)

Collibra follow-up. Mr. Govin said that he had talked to Collibra. They would charge $120,000 for three users. Mr. Forino said that was more than the University had paid for WebI. It was a competitive market, and the University could shop around. The Committee agreed that data governance is critical to the University but, after investigating the cost of hiring a company like Collibra, it seemed unlikely that the University would be willing to invest the funds needed.

ARC and WebI

Several IT Committee meetings were devoted to investigating problems arising from Columbia’s new financial accounting and reporting system, ARC. Gaps in the amount, level, and accuracy of the information provided by the new system have caused faculty and reporting staff many problems, including:

The Committee invited members of the ARC team to visit and, on January 25, 2013, spoke with Anne Sullivan, Executive Vice President for Finance, and Paul Reedy, Assistant Vice President, Finance Service Management, who manages the ARC service center, as well as Ron Forino and Ingrid Cole from CUIT. In general, Committee members who manage large and numerous research grants were the most vocal in their complaints about ARC. One Senator noted that his team had been unable to spend the money they had because they couldn’t set up accounts. Their business agreements were voided and they had to resubmit credentials. All kinds of details were not being properly managed. Everyone was complaining about the long data strings required by ARC. After three months his team did get accounts, but no one on those grants could do any work for four months. They didn’t meet their milestones. They couldn’t buy software.

Mr. Reedy (whose presentation is appended) said that his team had simply translated business requirements into technical requirements. His people included business analysts. What they were hearing now was that the basics of the ARC system were working well. Processing times had stabilized. Ms. Sullivan said that from what she had heard, relative to our peers, Columbia had had fewer bumps than usual. A number of the Committee members present very strongly disagreed that problems had been minor or that progress had been made in correcting them.

It seemed to members of the Committee that the ARC team appeared finally to be talking to end users but not to understand the seriousness of the problems. They did offer to visit the various departments and walk through the issues with them; however, very few changes have resulted from this, presumably because such changes are expensive to make. ARC has a very long way to go before it is useful for faculty with research grants.

An alternative to relying upon the ARC interface and report generation was discussed by Ron Forino and Ingrid Cole, who have developed a drag-and-drop software system called WebI, which allows users to build reports from ARC data. One can drag and drop fields into a report and add filters. Reports can be shared among users with similar requirements. Ultimately, however, WebI relies upon the data it gets from ARC. If particular fields are not populated in ARC, they are not available in WebI. If data is incorrect in ARC, which is still often the case, then WebI will still display bad data. The WebI data is also updated only periodically, so there may be a lag in the reports generated. To use WebI, a user must first develop a universe of table definitions and relationships between those tables. To build such tables, however, the user or developer must first get access to the data in the FDS (Financial Data Store), which is populated from PeopleSoft. This access appears to be difficult to obtain. It is unclear whether CUIT is continuing to work on WebI.

In a later meeting with Ron Forino, the Committee was informed about a new project to produce budget reports for Principal Investigators. Paul Reedy (ARC) was working on this effort. Mr. Forino’s group had developed a custom web viewer with UNI access so that such reports didn’t have to be emailed, which would compromise their security. The group was running prototypes and was hoping to use this service for PIs in various schools. The reports are now being beta-tested by users; there are still some problems with the accuracy of the underlying data and with the data presentation. For example, funding sources such as gifts are not included. ARC reporting problems such as continued aggregation of sub-award information remain unresolved.

Data Security

Erik Decker, Assistant Director, Information Security, CUMC, and Soumitra Sengupta, Associate Clinical Professor of Biomedical Informatics and Information Security Officer, Columbia University Medical Center, spoke with the IT Committee about new standards of data security being introduced at Columbia.

Mr. Decker said that at CUMC, many faculty have appointments at Columbia-Presbyterian, and this makes it difficult to police information assets. There have been a number of security breaches, as when a computer containing information on approximately 5,000 individuals was stolen in October 2012. CUMC had to report the breach to the Office for Civil Rights and tell the media about it. The incident was embarrassing and painful. The University could also be fined more than $1.5 million because the compromised data was all HIPAA information. The University is still talking to the Office for Civil Rights about a 2010 incident.

Mr. Decker’s group thereafter revised its policies and implemented an endpoint security campaign. Last year, 67 percent of security breaches resulted from the loss or theft of devices. These endpoints are the biggest risk to security. CUMC now wants all devices that contain sensitive data to be encrypted. Mr. Decker’s group’s encryption campaign is pursuing some 40,000 endpoint systems to ensure that they are secure. They have two programs in place, for information security and for assessment and certification. They have a four-person team that does assessment and certification, identifying and reporting risks and working out ways to reduce risk. So far, they have certified 143 systems and dropped 189. They encourage consolidation of systems.

Mr. Decker recommended that university workers use encrypted flash drives to store data. He has a Kingston data locker that cost $12, which he is giving away in a swap program. Multi-user systems have to go through a certification process for the IRB at CUMC. In August, a similar policy will go into effect for the Morningside campus.

Migrating to Google Mail

Alan Crosswell, Associate Vice President and Chief Technologist at CUIT, spoke to the Committee about the University’s forthcoming Google mail system. Currently, Columbia has 80,000 email users. He said the University was following its users, many of whom had already started using Gmail’s consumer service. Columbia had negotiated with both Google and Microsoft to devise a new email system, but Google’s system was much better. The University also evaluated Microsoft Office 365, the cloud-based software system, but found that schools that had migrated to it had been disappointed.

There will be no advertising in Google’s applications for education. Columbia’s data will be stored in the U.S. Columbia’s contract is for four years, and the university maintains an “exit strategy.” The university’s use of Google Docs had been stalled because Google Docs doesn’t comply with ADA regulations requiring accessibility. The University of Michigan has discouraged the use of Google Docs because of that.

As of April 2012, Columbia’s new system had 27,000 users. The University is adopting it school by school. The medical school is not going with the Google system because Google wouldn’t sign a business associate agreement regarding HIPAA requirements. The Business School was not considering Google at that time.

On a related note, the University is currently revising its email use policy. The Committee will be interested in discussing that in the coming year.


Julia Hirschberg, Professor of Computer Science; Chair, Department of Computer Science
Breck Witte, Director, Library Information Technology Office


Committee on Information and Communications Technology 2012-2013

Fac.          Henry Spotnitz                  P&S
Fac.          Itsik Pe’er n-s                 SEAS
Fac.          Victoria Stodden n-s            A&S/NS
Fac.          Julia Hirschberg, CO-CHAIR      SEAS
Fac.          Mark Cohen                      BUS

Stu.          Akshay Shah                     SEAS
Stu.          Zahrah Taufique                 P&S

Off. Res.     Hatim Diab n-s

Libraries     Breck Witte, CO-CHAIR

Admin.        Candace Fleming n-s
Admin.        Ellen Binder n-s

Admin. staff  Norberto Govin n-s

Alum.         Stephen Negron n-s

n-s = non-senator