Am I not understanding this:
The record-shattering fundraising by Democratic presidential candidates Barack Obama and Hillary Clinton has reshaped the financing of presidential elections and generated breathless coverage and analysis of the otherwise arcane area of campaign finance.
Yet it’s had another consequence that has gone all but unnoticed. The campaign finance reports filed by Obama and Clinton have grown so massive that they’ve strained the capacity of the Federal Election Commission, good government groups, the media and even software applications to process and make sense of the data.
A milestone of sorts was reached earlier this year, when Obama, the Illinois senator whose revolutionary online fundraising has overwhelmed Clinton, filed an electronic fundraising report so large it could not be processed by popular basic spreadsheet applications like Microsoft Excel 2003 and Lotus 1-2-3.
Those programs can’t download data files with more than 65,536 rows or 256 columns.
***If you want to comb through Obama or Clinton’s cash, you either need to divide and import their reports section-by-section (a time-consuming and mind-numbing process) or purchase a more powerful database application, such as Microsoft Access or Microsoft Excel 2007, both of which retail for $229.
The FEC can’t afford a copy of Excel 2007? I can lend them my laptop if they need, but they better not talk while I am watching BSG, and they should be warned that Tunch loves company and will probably pester them when they come over.
Krista
Bizarre — you’d almost think that they’d have their own internal program that’d be even better than Excel. Evidently not.
And they can’t borrow my laptop. I need it for Desktop Tower Defense and wasting time commenting on blogs work.
nightjar
Oh, that money went to a much better cause in Iraq. The war effort must include Halliburton monogrammed towels. Advertising is everything they say.
SamFromUtah
Maybe there have been funding cuts to the FEC – seems to me that’d be one of the first parts of the government that the drowners-in-bathtubs would want to drown.
evie
Thank you. I said the same thing on Ben’s site. This isn’t brain surgery, here. Load it into a database and be done with it. But, apparently for our government (and some in the media) large amounts of data is some sort of insurmountable problem.
Dennis - SGMM
Makes you wonder what they’re doing with the terabytes of shit that NSA is pulling off the net. I figure, nothing.
jake
I blame Herr Hans von Spankofsky.
PeakVT
What do you expect after 7 years of Republicans trying to shrink government down to where it can be drowned in the bathtub?
Rick Taylor
Off topic, but via Kagro X at Dailykos, it appears the ability of congress to use subpoenas for oversight is pretty much dead.
Rove is saying the courts will have to resolve his subpoena and
It almost makes me wish Clinton would win, just so we could get to see her making full use of all those shiny new executive powers the Bush administration has developed over the last seven years.
merrinc
What kind of computer illiterate brain dead morons would use spreadsheets to track and analyze massive quantities of financial data? For crying out loud.
And this:
Um, no.
Soylent Green
Federal offices seldom upgrade when the private sector does. Mine is still running NT 2000. To get an application you need, you put in a request for technical approval. Then wait two years. Then learn that there’s no funding.
liberal
John Cole wrote, The FEC can’t afford a copy of Excel 2007?
Actually, the real question should be, “they can’t afford a copy of PostgreSQL, which is free?”
demimondian
It’s more complicated than that, and has to do with the dreaded “file format” issue.
(Warning: major geekery follows. I’ll try to keep this accessible to humans, and if I fail, I hope that others will translate.)
Back in 1995, Microsoft updated all of its Office file formats. That file format update caused major dislocation and, in Microsoft parlance “customer pushback” — big buyers screamed. Microsoft had foreseen this (Office had started taking off in the previous two releases), and had made the file format forward-extensible, and so Microsoft simply stopped changing the file format.
Great. Good engineering, etc., etc., etc. Except there was an unintended consequence. Like all file formats, xls (the name for the Excel file format used in versions up to 95) reflected the realities of the time, and one of those was that on a machine with 64K (or even 256K) of RAM, you didn’t have 64K rows in a spreadsheet.
Roll forward 8 years, to 2003, when O2K3 is released. 64K rows in a table is not unheard of, but you’d use the right tool to represent such a table: it’s called a “database”. A spreadsheet really isn’t the right metaphor. More than that, you wouldn’t ship that table around by the connectivity widely available at the time; remember that broadband penetration in the US is very limited in 2003. (Yes, that’s only 5 years ago. So?)
In 2003, then, the FEC has to make some technology decisions looking forward to broadband dissemination to libraries. The 64K row limit is known and understood, but ignored — reasonable engineering. This is before Kerry’s 2004 New York Times advertisement, after all; the fund raising potential of the web is completely unrealized, and we’re still thinking about bundlers…maybe.
Looking at those constraints, they make a guess, and pick XLS 95/2000/2003. Looking back from the vantage point of five years later, that might seem stupid, but…it wasn’t at the time. (And, FWIW, the history of the raising of the row limit is actually funny — maybe pb or one of the linux-philes will fill everybody in on ODF, IBM, and the importance of Mitt Romney’s presidential aspirations in that decision.)
liberal
…or purchase a more powerful database application, such as Microsoft Access or Microsoft Excel 2007
Excel is a spreadsheet, not a database.
Access is a DB, but it’s pretty fragile because it doesn’t follow a client/server paradigm. Would never use it in anything like an enterprise environment. Not sure why anyone would use it when something like PostgreSQL is available for free.
DR
Excel is not a database program, and Access is a piece of ****. I agree with liberal: PostgreSQL is a solid database server, and it’s free (it’s started displacing Oracle in certain areas, including Skype’s entire operations…). There are a number of free applications they could use in addition to PostgreSQL to make the whole thing work flawlessly.
But frankly, Excel is NOWHERE NEAR adequate for that kind of work, and the person who decided to use it should be fired for sheer incompetence.
DougJ
This is bad news for Democrats.
liberal
demimondian wrote, Looking at those constraints, they make a guess, and pick XLS 95/2000/2003. Looking back from the vantage point of five years later, that might seem stupid, but…it wasn’t at the time.
Actually, it was. What’s wrong with something rather more generic like *.csv? You can import that into an RDBMS without having to be all that clever. No limits on size in principle.
That government requires proprietary formats like *.doc and *.xls is crazy.
Krista
I’ve never even heard of this program. We use Access for our membership database, but it’s because we’re a start-up non-profit, and can’t afford stuff like Raiser’s Edge. I’ll have to check that PostgreSQL out…
Bey
Microsoft Access???
It is to weep.
“Hey guys, let’s build an Access database for xxx!” has been the bane of my professional life. 3 years later it’s corrupt as all get out, the original brainiac who put it together is long gone, but now it’s integrated into all their business practices.
mimaqueen
SQL Server Database made by Microsoft.
Data goes into a database. ACCESS cannot handle a million records.
Have you heard of Oracle or DB2?
demimondian
Which version of .csv? Although you would think that “comma separated value” would be a well-defined format, you’d be…wrong. What about strings? How do you handle names with Unicode characters in them? Did you even realize that there are names with Unicode characters in them? How do you handle donations made in foreign currencies and recorded in USD equivalents? How do you handle embedded commas? What about embedded quotation marks?
XLS is actually a much better choice than CSV. It’s got a well-defined, gold standard parser (Excel 2003). That’s not true for csv — if you’re interested in a publicly accessible “parser” for csv, go look at the Python 2.4 csv module. It’s huge — and, by van Rossum’s own admission, doesn’t entirely work. Still, it’s as good a parser as there is in the public domain.
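For anyone who wants to see the embedded-comma, embedded-quote, and non-ASCII cases concretely, here is a minimal sketch using the csv module from Python's standard library. It assumes Python 3 (where strings are Unicode by default) rather than the 2.4 module mentioned above, and the sample donor rows and the `round_trip` helper are invented purely for illustration. It only shows that one writer and one reader that agree on quoting rules can round-trip such fields; it says nothing about CSV files produced by other tools.

```python
import csv
import io

# Hypothetical sample rows: an embedded comma, embedded quotation marks,
# and a name with non-ASCII characters.
rows = [
    ["name", "employer", "amount_usd"],
    ["Smith, John", 'Acme "Widgets" Inc.', "25.00"],
    ["Zoë Müller", "Self-employed", "10.00"],
]

def round_trip(rows):
    """Write rows as CSV text, then parse that text back into rows."""
    buf = io.StringIO(newline="")
    # QUOTE_MINIMAL quotes only fields that contain the delimiter,
    # the quote character, or a line break; quotes inside are doubled.
    csv.writer(buf, quoting=csv.QUOTE_MINIMAL).writerows(rows)
    text = buf.getvalue()
    parsed = list(csv.reader(io.StringIO(text, newline="")))
    return text, parsed

csv_text, parsed_rows = round_trip(rows)
print(csv_text)
```

The catch is the one raised in the comment: this works because both ends use the same dialect. A file written by a different program with different quoting or encoding choices may not parse the same way.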
Dennis - SGMM
Heh. “It just needs some touching up. I’m sure it won’t take you more than a couple of hours…”
Bey
STOP SAYING THAT!!!! AIEEEEEEE~~~~ /dies
demimondian
You people who are advocating using a database are missing the point. Of course you want to analyze this using a database — this isn’t an analysis problem. It’s a data transmission and representation problem. The FEC has a single table to send around. That is a situation which is best solved by sending around a set of xls files currently, and asking people to glue them together side by side.
Yes, the problem could be solved by sending around a gzipped XML file with a suitable schema or DTD, or, worse, a csv, but those are much harder for normal people to read. The problem could be better solved by sending out xlsx files and xls files side by side, or even odf and xls files side by side. But we’re not talking about processing, we’re talking about dissemination.
ThatLeftTurnInABQ
demi nails it.
This is the legacy of a circa 2000-2003 decision regarding technology that was not terribly unreasonable at the time (note: I deal professionally with some branches of the Federal Govt. which you would expect to be way more advanced than the FEC from an IT standpoint, and you guys would be shocked to know how common .xls files are, used as pseudo-databases and for all sorts of other purposes for which they are not the best tool).
The real news here is not the technology, it is how shockingly narrow the base was upon which the nuts and bolts of our political system ran. If this is the first campaign where this 65k limit has become a major issue, then in all previous contests fewer than 65k Americans (i.e. about 2 hundredths of 1 percent of our current population) controlled the flow of money which is the lifeblood of politics.
Think about it for a minute.
The key decisions, such as “can candidate X continue to run, or do they have to drop out?”, and more basically: “who has the resources to win a close election?”, these decisions were being made by at most 2 hundredths of 1 percent of the population.
Is that really a democracy?
If the change in the funding model which has occurred during the Democratic nomination contest this year becomes permanent, and spreads downwards into Congressional races and to state and local politics, that may be the single biggest thing to happen in US politics (for which this race will be long remembered) in a very long time.
Apsaras
Federal Government employee checking in, and our ever helpful BearingPoint private contractors’ solution to anything and everything seems to be “Build another damn Access database!”
I do a lot of my financial work in an access database, but it’s not terribly complicated and therefore has only crashed a few times on us. Still, it’s a hassle and slow as shit.
Dave_Violence
This is a bunch of bullshit. Fake, fake, fake non-story. Who uses Lotus 1-2-3 any more? …and who would bother with Excel for this kind of work?
Yeah, yeah, yeah, this is a fake story designed to get all of us “geeks” in a tizzy. The real question is: so, the FEC didn’t contract this work out to a data processing firm?
Move along…
demimondian
ThatLeftTurnInABQ nails it; it’s why I keep coming back to Kerry’s NYT ad. I think that ad, along with the fund-raising success of MoveOn.org, changed American politics forever. What is public funding, if you can gin up a political advertisement at low cost and target an audience *and successfully raise money*? That’s always been the barrier that kept the hoi polloi out of politics.
Now, it may well be that the price of campaigning will simply rise accordingly. We’ll see — but in the interim, we’re in for an interesting ride.
JGabriel
ThatLeftTurnInABQ:
GOP Reaction:
ThymeZone
From a technical point of view, merrinc at the 11:43 timestamp gets it exactly right.
Of course, if you are trying to get a PhD in Acronyms that geeks toss around to look smart and cool, you can go with demi’s post.
But anyway, if you want to do database type work, you need a real database engine and real tools to work with it and administer it.
We could argue forever about those, but one thing we should be able to agree on is that Access and Excel are not real database platforms. They are to real database engines what the Monopoly(tm) race car is to the real Indianapolis 500. Just toys, and way out of date.
Most importantly, the government’s problem is not so much about being able to choose tools. It’s about the crazy and dysfunctional way we make government do budgeting and funding, the insane “fiscal year” model of financing programs and initiatives that take years to plan, and manage properly, and the politicization of these processes.
It’s a surefire way to end up with things that cost way more than they are worth, while other things that are needed go by the wayside.
The Other Steve
Using Excel = A business problem… accountants and such have no problem.
Using Access means you’ve moved into the realm of hiring an expert. Now you need some thought into what you are building, and an IT resource to put it together. You need to work out methods to import data, etc.
Using SQL server or something else… Well now not only do you have data design, but you’ve got to build a front end application to access the data.
Excel is used all over the place, not just government, but in business as well. To move from Excel to something else, which happens when you’ve proven your concept and grown to need more capacity is a big deal, as it involves hiring a new resource or contracting it out, etc. You’re talking probably a minimum $100,000 decision because even if you contract out the construction you need to spend time writing up what you need.
Still, one would think the FEC would be in a position to pull this into a database, and then provide a UI which allows people to query it and download the results.
I did stuff like this for the mortgage company I worked for, with similar amounts of data. We had a $1 million annual budget to maintain our database.
It’s a bit more than $229 for a copy of Office.
Davebo
Sure it can. Just not very efficiently and it will corrupt easily.
SQL Server is nice, but why waste the grand when you have free alternatives?
Stored Procedures are nice, but not something I can’t live without.
DR
XLS better than CSV?? Don’t think so. Problem is:
1. XLS is a closed, proprietary format. There are no proper parsers save Microsoft’s own (the others are at least partially reverse engineered, since M$ refuses to publish the entire spec).
2. CSV is easily specified to ensure validity, at least as much so as XLS is.
Simple fact is that XLS is most often consumed as is, within Excel. Rarely is it parsed by other applications and imported into a real system. There are scores of successful CSV-based export/import systems out there, even if they are not perfect.
My view would be that XML would be preferable (even if it is extremely verbose). Even X12 would be. But XLS is NOT, for the simple fact that, get this, NOT EVERYONE USES WINDOWS. The vast majority of real server systems are NOT Windows-based, especially in the corporate or governmental data center, where Unix and Linux dominate.
Excel is the Devil’s Toy :-)
The Other Steve
I think ThatLeftTurnInABQ has it right, that this is really amazing now we’re talking about several million people giving money.
RSA
Demi’s observations (which look right to me) aside, I’m not sure that the problem is accurately stated in this fragment:
The rest of the article actually doesn’t mention any problems internal to the FEC in processing the data; it’s all about making the data accessible to the public in a format that older applications can use.
My uninformed guess is that the FEC does have reasonable software running internally, but that some script had some parameters based on faulty assumptions about the data built into it–parameters that could be changed easily enough with a bit of tuning within 12 hours. Of course, I could be wrong, as demi and ThatLeftTurnInABQ could tell me.
ThymeZone
TOS is on the right track too. In my environment we collect and disburse $2m a month in amounts averaging around $30, in transactions that have to be accounted for to the penny over periods of years. To a portfolio of business rules that would fill a couple of NYC phone directories in small type.
The database engine we use is $50k a year just for the ongoing license and support. I am not giving you the whole story for reasons of anonymity, but those are pertinent facts.
But anyway, expect more of this kind of crazy in the years ahead. When govt agencies get starved for money, their semifunctional ways become completely dysfunctional and desperate.
Of course, when they are too fat with money, they become porcine and corrupt. So, there you are.
ThymeZone
Before we devolve into a full on geek flame war, here’s one thing we can agree on.
And if Satan plays with Excel, he owns the distribution rights to Access.
The Other Steve
Unix hasn’t dominated for quite some time. Most Oracle installations today are on Windows, not Unix.
The point demi was making is that XLS is a no-brainer. You just send the file. CSV you have to send the file with another file which describes the contents of the first file so you can import it correctly.
Dennis - SGMM
Look for Congress to legislate any number of barriers and limits to online funding of candidates. Even though the Supremes famously decided that speech is money, Congress will devise legislation to make certain that those individuals and corporations with the most money will still have the most speech. To do otherwise would mean that our Congress critters would need to develop a compelling message and actually support their constituents. Our proud system of corporate welfare would not survive the loss of legislative influence by the banks, big pharma, the telcos, etc.
The Other Steve
Don’t worry. Eventually they’ll upgrade to Excel 2007 and find that they can make the semifunctional ways last a little bit longer.
demimondian
Sorry, DR, but that just isn’t true — and my employer runs more Linux servers than any other organization in the world. Windows owns the server space, and has for three to four years. Unix lost, and is losing ground pervasively. There are a number of good reasons for that, but the simplest is that the boundary between client and server blurred and faded out, to the point that the little laptop in front of me can and has supported significant loads (by the standards of any organization except my employer, a weather bureau, or a weapons lab) on it.
ThymeZone
Yes. Actually, almost any old flat file format that is comprehensive and where there are agreed on rules between supplier and consumer is just fine.
I watch young propellerheads today struggling with parsing problems that we solved in old formats 30 years ago. And we can work circles around these guys. It’s much faster when you are using wheels long ago invented and proved out.
What’s the matter with kids today? (Bye Bye Birdie). I mean, they are just lazy. They don’t want to write for parsing problems, they want some fancy tool to do it all for them. I can write the parsing gadgets faster than they can shop for the tool.
Hey, you might not agree with me but who else is providing musical entertainment on this thread? Huh?
The Other Steve
Not Congress. Republicans.
They absolutely hate it now that the Democrats have figured out a way to outraise them. So they’ve suddenly become huge champions of “campaign finance reform”.
I remember back in 2000, GW Bush’s website had a database tool where you could look up people and how much they gave. It was interesting, and I never figured out where they got these numbers, but they had a LOT of people who had given 2 or 3 dollars. Even at that time, they were trying to rig their numbers so it looked like they had more small dollar donors.
I never understood the 2 or 3 dollar donations though. I could see $5, or $10 or $20. I’ve tossed that much before into a hat at a political rally.
Just seemed weird.
demimondian
For those of you not in on the joke, the answer to the question in merrinc’s post at 11:43:
is “most banks, actually”. Excel evolved to serve a particular clientele: financial analysts. It then spread to loan originators and officers, and then, having metastasized, to the rest of the financial industry.
We’ll leave it as an exercise to the reader to determine if TZ was playing along with the joke, or as ill-informed as he seemed.
Chuck Butcher
If you find the percentage of contributors shocking, DPO, which is one of the more active State Parties, finds 1% of registered Democrats being active in a County Party a wonderful thing, and one which typically only happens in the small population counties. The political organization with the greatest reach into politicians can’t get 0.5% involved.
ThatLeftTurnInABQ
RSA,
That would be my guess too, speculating on the basis of the limited info in the article and my experience with other Govt agencies. It is a very common pattern IMHO to have Oracle apps or something similar running as a backend, but to use Excel to communicate with the outside world because it has become the lingua franca of business and govt.
The lag time on these technology choices is about 10 years, especially when you are dealing on a lowest common denominator basis with non-profits, small businesses or other small organizations which you need to exchange data with, who can’t afford to upgrade their IT to keep up with changing fashions. That is why XML or other more modern formats (than csv or xls) are not yet in widespread use.
Also, The Other Steve has it exactly right re: these IT decisions. You are implicitly investing (or not) in a support infrastructure when making these decisions, and upgrading to a superior database technology without the necessary level of support (which is a very expensive long term commitment) can be a disaster.
I have extensive experience with one Fed. site running a 100+ million dollar Oracle apps ERP suite but with grossly inadequate programmer-analyst staffing, whose system has turned into a very expensive black hole. Data goes in, but no information comes back out, because they don’t have enough DBAs and other folks who understand how to get info back out of their system, so they can’t answer even very simple questions like “how much of product X did we purchase last year?” without massive outside assistance.
slag
This “story” was just headline-grabbing rubbish.
As regards the PostgreSQL, MySQL, other free solutions: They’re free to start up but require some specialized knowledge/training to create and maintain decent databases. As a consequence, the long-term costs may outweigh their initial free-ness.
While Excel is a spreadsheet application and not a database, it’s probably the cheapest solution.
If this story were actually worth anything at all, it would have addressed the larger issue of technology and government and how the more operationally-focused aspects of the government often get the budgetary shaft because people see administrative costs as waste rather than as having an actual purpose. Government needs to be transparent and accountable, and technology plays an important role in making that happen.
Obama’s understanding of the importance of technology and the actions he has undertaken as a Senator to make government more transparent are some of the major reasons I like him so much.
That said, this Politico story was really, really stupid.
ThymeZone
Sure, but spreadsheets have a rather narrow purpose band. You can “plug in” and play with numbers. But to use one as a container for large quantities of data, or for manipulation of large quantities of data, is an exercise in foolishness and frustration.
I have problems that make me process 12 million rows at a time. Excel, or Oracle or Db2 or Informix? And keep the 12m rows online and available to 3000 impatient users at the same time. And provide as near bulletproof redundancy and availability as possible, while doing it. Excel? Really?
You take the Excel, I’ll take the real database. And as usual, thanks so much for your great advice.
demimondian
Yup. And you still haven’t explained how to handle names which aren’t written in the bottom 127 characters.
When you can sort pinyan — when you even know what pinyan is and tell me how to sort it (hint: that’s a trick question) — you can come back and try to help fix the problems you caused. Until then, go back and play in your room, child, and let the grown ups clean up the mess you made before we sent you away.
The Other Steve
For those who didn’t know… The entire sub-prime mortgage business was run off excel. Loans were bought and sold and traded in lists contained within excel spreadsheets.
It’s a very popular and commonplace tool in business.
I’m not saying it’s wise, or a great idea, but it is common place, and there’s a huge learning curve to move to something different.
IamTheJudge
First post
We are missing the point, from the original article:
The FEC is NOT the one having problems storing, processing, or analyzing the data. Instead, as several posters mentioned, it is a problem for casual or professional analysts who rely on Excel 2003 to do the number crunching: given the number of rows required by BHO’s and HRC’s extensive fundraising, it simply can’t store a report in one single spreadsheet.
Many people still use Excel 2003 b/c they are very proficient with it or b/c they have extensive custom programming (macros, formulas) that works well for them, and they don’t want to go through the hassle of upgrading and debugging.
ThymeZone
Well, you have gone off the rail much earlier than you usually do in these exchanges, demi.
First of all, did you mean PinYin?
I hope you didn’t mean Pinyan. But, you might have. Heh.
Anyway, no I don’t sort PinYin, I work in Arizona with data collected from Arizona financial transactions.
Hopefully, I can retire before the PinYin crisis hits my shop.
You really are a hoot, dood.
demimondian
Right. You do that, being as you are insulated from your own mistakes by a vast budget that is effectively unauditable.
I’ll use Excel, and push a single technology to its limits, and frequently discover that those limits are much less restrictive than you think. Macros don’t work? Fine, I’ll write plug ins. VB isn’t fast enough? Fine — C# over the CLR is. Need to push stuff to a remote back end? Great, that’s what Excel server is for.
So you do kewl stuff with tables and normalization. I’ll get rid of you and replace you with three college students.
ThymeZone
Demi, I am sending this along to you because, as you know, ignorance of the law is no excuse.
ThymeZone
Um, if you only knew how wrong you are about that, you’d be …. in denial. Or completely humiliated.
But as luck would have it, I cannot enlighten you lest I pull back the thin cover of anonymity I have left.
Let’s just say, I could end up taking a pay cut before this year is out, but it’s better than losing my job.
ThymeZone
A subtle, but necessary, improvement ….
ThymeZone
Lecher.
RSA
Thanks for the info, ThatLeftTurnInABQ.
On data formats, my last name is recorded in three different ways in various state and federal databases (judging from my SS card, passport, and driver’s license). I wonder if they know I’m just one person?
liberal
slag wrote, As regards the PostgreSQL, MySQL, other free solutions: They’re free to start up but require some specialized knowledge/training to create and maintain decent databases. As a consequence, the long-term costs may outweigh their initial free-ness.
Huh?
You mean, sticking data in Excel spreadsheets, and then having it turn out that you have a major disaster on your hands because Excel isn’t an RDBMS and hence lacks proper integrity constraints doesn’t entail a long-term cost?
If you think that someone with little training can properly maintain a commercial solution or even properly design a database…
Krista
Yep. And that’s the problem with a lot of the database programs out there.
I work for a small non-profit with one other person. I’m the computer “expert” of the two of us, and my skills are seriously lacking. We have over a thousand members, over 100 volunteers, and we do a major fundraising campaign every August. We use Access and Excel for everything.
Is it the best? Probably not. But the free database software isn’t intuitive enough for dummies like me, and the software that IS intuitive and designed for dummies (Raiser’s Edge, DonorPerfect) has a four-figure price tag.
So Access has its place. It’s not the best, but in a pinch, it’s better than nothing.
demimondian
No, and that’s by design. There are, however, companies that make a tidy profit from “merge-purge” to determine that those three records correspond to a single individual.
demimondian
I would challenge that. If it has its place, then it is best *in its place*.
Don’t let the snobs get you down, because they don’t know what they’re talking about. Tools are tools, and different tools are useful in different places. None of the snobs preaching about DB2 or the like could use the database I use every day, because my team wrote it to answer one particular kind of query extremely fast. In its domain, it’s the best there is. Outside of that domain? It sucks.
Access is the same way, and don’t let anybody tell you anything else.
demimondian
According to the rules, I think I’m supposed to say “And they’ll still be paying you more than you’re worth,” right? I don’t have the heart for it.
Sorry, I’ve taken pay cuts, and they suck. Sorry, dude.
KRK
The discussion is beyond me, so I’ll focus on the title of your post.
Boy did I love me some TRS-80 back in the day. My mom was a teacher in a very small school district and I was an acknowledged smart kid so I got to keep one of the school’s TRS-80s at home over summer and other holidays. I remember my running tally of where on the cassette tape counter various programs would start, though I can’t recall now what those programs might have been.
My vast knowledge amassed on the TRS-80 led me to a week-long computing camp in 1982 where I was just a little bit younger than the high school crowd there. The ratio of boys to girls was 7:1, I first heard Soft Cell’s “Tainted Love,” I first saw (and envied) girls wearing Levi’s 501s, and the organizers (who were they?) showed a movie one night wherein I learned what the Mann Act was. Good, good times.
libarbarian
It’s Clinton’s fault
El Cid
I bet the free, downloadable “OpenOffice” could handle it, and be compatible with Microsoft Excel 2007.
I did buy MS Office 2007, and it’s real fun, because every large spreadsheet with images which gets large, say 6MB or so, crashes and removes all images unless you’ve saved it in the older file format of Excel 97-2000 or whatever.
demimondian
You’d lose your bet. It’s a file format limitation, not an operational limitation.
ThymeZone
Luckily, that issue is not on the table here :)
As it turns out, my employer is in no position to figure out what anyone is actually worth. The cheeses are too busy covering their own asses all the time.
ThymeZone
Yes, true, although I have to say, “better than nothing” also describes an icepick when it’s all you have and your car won’t start.
But anyway, Access’ place is on the desktop. If it’s much bigger than a desktop solution you need, beware.
Or you can link Access tables to a real database engine like SQL Server and sort of have a hybrid thingie that will work for small groups.
Sorta kinda. There are data type issues. If you don’t know what that means, it just means that you might find yourself trying to drag square pegs into round holes.
Rome Again
Hogwartz first. You are much more important.
Sleeper
How far along are you on Battlestar, by the way?
bago
Are you high? Xml is designed to represent structured data, while a database is designed for Relational data.
And anyone trying to load a modern db into a 32 bit OS for industrial queries is completely bonkers. We’ve been building out terabyte + databases for at least a decade.
bago
And Seriously, CSV? You want to run string parsing ops on a terabyte of data for anything not cached in RAM? Yeah, let me run non-indexed data at O(n) with a string.split over column widths. A non-indexed O(n) x C op for every query.
Smokin some good shit.
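To make the scan-versus-index point concrete, here is a small sketch in Python. Everything in it is an assumption for illustration: the line layout (donor_id,candidate,amount), the generated data, and the helper names `scan_for_donor` and `build_index`. It contrasts splitting every line on every query with building a lookup table once, which is roughly what a database index buys you.

```python
from collections import defaultdict

# Hypothetical raw CSV-style lines: donor_id,candidate,amount
lines = [
    f"{i},{'Obama' if i % 2 else 'Clinton'},{i % 100 + 1}"
    for i in range(10000)
]

def scan_for_donor(lines, donor_id):
    """Unindexed lookup: split every line and compare, O(n) per query."""
    return [line for line in lines if line.split(",")[0] == donor_id]

def build_index(lines):
    """One O(n) pass that maps donor_id -> list of matching lines."""
    index = defaultdict(list)
    for line in lines:
        index[line.split(",", 1)[0]].append(line)
    return index

index = build_index(lines)
by_scan = scan_for_donor(lines, "4242")   # touches all 10,000 lines
by_index = index.get("4242", [])          # a single dictionary lookup
print(by_scan, by_index)
```

Both lookups return the same rows. The difference is that the scan repeats the full pass for each query, while the index pays that cost once up front.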
uh_clem
While it may sound absurd that the FEC can’t just buy a copy of the latest version of M$ Office, anybody who’s ever worked with large organizations that depend on standards and interoperability knows that it’s no trivial task to change a standard.
Sure, you or I can easily install a new suite of tools on our PC, but multiply that by a thousand employees and the problem is not so small. The cost of the software is usually small compared to the internal IT costs of updating everybody’s machine and swatting all the inevitable anomalies that crop up.
Which is to say that large organizations tend to move slowly. Did you know that NPR’s satellite automation ran on OS/2 until about a year ago? (and that it’s still being supported for backward compatibility for stations who haven’t upgraded to the new system yet.) The fact that the FEC is using a 5 year old file format is not really very remarkable.
Oh, one more thing while I’m wearing my systems analyst hat: you don’t scrap an entire system just because you find one or two exceptional cases. Hillary and Obama have more data than can fit in a single file – that’s two exceptional cases. Since there’s a fairly straightforward work-around (split the data into multiple files), doing that makes more sense than a crash-course in software upgrades.
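The split-into-multiple-files workaround can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 65,536-row figure is the limit cited in the article, while the `split_rows` helper, the column names, and the 150,000-row sample report are made up. Each chunk repeats the header so it can be opened on its own.

```python
XLS_MAX_ROWS = 65536  # row limit of the older spreadsheet format, per the article

def split_rows(rows, max_rows=XLS_MAX_ROWS):
    """Split rows (first row is the header) into chunks of at most max_rows,
    repeating the header at the top of every chunk."""
    header, data = rows[0], rows[1:]
    per_chunk = max_rows - 1  # reserve one row in each chunk for the header
    if per_chunk < 1:
        raise ValueError("max_rows must be at least 2")
    return [
        [header] + data[i:i + per_chunk]
        for i in range(0, len(data), per_chunk)
    ]

# Hypothetical report: a header plus 150,000 donation rows.
report = [["donor_id", "amount_usd"]] + [
    [str(i), "25.00"] for i in range(150000)
]
chunks = split_rows(report)
print(len(chunks), [len(c) for c in chunks])
```

Gluing the pieces back together is the reverse: drop the header from every chunk after the first and concatenate.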
liberal
bago wrote, And Seriously, CSV?
That’s not the point. That particular question concerned what spreadsheet format to use, given the assumption that people submitting to the FEC would be sending in _some_ kind of spreadsheet.
Of course, once it gets to the FEC, it should be loaded into a real RDBMS, where needless to say it would not be stored as CSV.
I hear you on the XML, BTW.
bago
CSV is almost as old as punchcards. Seriously.
liberal
bago wrote, CSV is almost as old as punchcards. Seriously.
Huh? The C programming language is also pretty old. So is the concept of “subroutine”. Does that mean no one should use C or function calls?
Not to mention that XML is far newer than the relational model (although of course it has antecedents that date to the time of the relational model). Does that mean an XML “database” is superior to a relational DB?
The advantage of CSV is that it’s extremely generic.
IIRC I was dumping data from an Access table—BTW, one which had nicely become corrupted, with the rows broken, through some Access miracle—for import into PostgreSQL and the simplest, cleanest solution for exporting and then importing was CSV. Worked like a charm.
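For the curious, the export-to-CSV-then-import route looks roughly like this. It is a sketch, not the actual procedure used above: it swaps in SQLite (via Python's standard sqlite3 module) as a stand-in for PostgreSQL so that it runs without a server, and the table name, column names, and sample rows are all invented.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export; in practice this would be read from a file.
csv_text = (
    "donor,candidate,amount_usd\r\n"
    '"Smith, John",Obama,25.00\r\n'
    '"Doe, Jane",Clinton,100.00\r\n'
    '"Smith, John",Obama,50.00\r\n'
)

# In-memory SQLite database, used here as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE donations (donor TEXT, candidate TEXT, amount_usd REAL)"
)

reader = csv.reader(io.StringIO(csv_text, newline=""))
next(reader)  # skip the header row
# Parameterized insert: each parsed CSV row becomes one table row.
conn.executemany("INSERT INTO donations VALUES (?, ?, ?)", reader)
conn.commit()

# Once the data is in a table, questions become ordinary SQL queries.
totals = conn.execute(
    "SELECT candidate, SUM(amount_usd) FROM donations "
    "GROUP BY candidate ORDER BY candidate"
).fetchall()
print(totals)
```

With a real PostgreSQL server the same idea applies, though a bulk loader would normally replace the row-by-row insert.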