Building a Linux NAS box

Postby john frum » Sat Nov 03, 2007 12:37 pm

I have never managed to stump up the enthusiasm to do this, although I have been considering it for ages.

I have an old Dell 1Ghz Intel box with an expired (corporate) XP system on it - long retired. It is memory-poor (256 or 512, I have forgotten) and uses a briefly available odd memory format which costs an arm and a leg. I'm assuming that as a NAS storage device this isn't an issue. I was thinking of adding a RAID controller and Linux, to create a secure, capacious backup device.

Anybody out there done this and can comment on the difficulty involved? I worked in IT for years, in an absurdly specialised field so I am not unfamiliar with the kind of headaches I'm liable to encounter but lack any experience with Unix apart from (briefly) Netware (?) Effectively I know nuffink about Linux.

All advice appreciated.
john frum
 
Posts: 228
Joined: Tue Dec 05, 2006 12:46 pm
Location: S. England

Postby oldguy » Sat Nov 03, 2007 1:16 pm

I do not understand your question. Are you looking for additional storage, the modern-day equivalent of a tape backup, or another computer to use in an emergency?

For the first two, what's wrong with a hard drive in a box with a USB cord? Super cheap, huge capacity. Buy two, they won't both fail at the same time. You can just buy the box and put your old hard drive in it, works great.
oldguy
 
Posts: 220
Joined: Sat Jun 24, 2006 5:46 am

Postby Kevgermany » Sat Nov 03, 2007 2:00 pm

Oldguy - NAS = Network Attached Storage. A pc cabled over the lan that does nothing but provide disk space to the other machines. Once configured there's no keyboard/screen on or needed.
John, some thoughts

Power supply may need an upgrade to cope with plenty of hard drives.
Old case may not take a new power supply.
Old case may not take the extra disks (assuming Raid 5, it's 3 or more).
Memory upgrades may be difficult/expensive and the limited memory could be a bottleneck.
May have linux incompatible components. Could check this by burning an Ubuntu DVD and bringing the machine up from the DVD (assuming you have a DVD drive or can borrow one and reconfigure the bios to boot from it.). Knoppix is another distro that'll boot from DVD, possibly even CD.

Performance is likely to be slow, but not too bad as the raid card will do most of the processing. However the bus & processor on the old box is going to be a bottleneck.

May be cheaper to get a new case, twin processor raid motherboard and start from there. These should have gigabit ethernet on board and if you shop around, you may find one with an external S-ATA port (eSATA) which'll aid things in future. There's a lot of info on Steve's hardware on these things, including an article comparing raid motherboards (and they achieved some high speeds!). The killer in your proposal is the cost of a decent raid card, compounded by the possible linux compatibility issues. btw, an OEM copy of XP home, which you'd qualify for, isn't that expensive.

Another portable, straightforward, but slow option is an external USB drive. 400Gb is now cheap if you shop around and for copying or backups they're great. I wouldn't like to use one for a swap file, though.

My personal opinion is that NAS is an expensive over engineered solution to multiple USB drives.
Kev

Man is limited by his fears, not by his imagination.
Kevgermany
Site Admin
 
Posts: 5269
Joined: Fri Feb 25, 2005 9:11 pm
Location: near Munich, Germany

Postby john frum » Sat Nov 03, 2007 4:17 pm

Kev, thanks.
I was definitely under the impression that as a simple file server the performance issue (processor, RAM, bus) wasn't er, an issue. I certainly recall a few articles that tended to suggest as much.

The problem with external drives - internal for that matter - is that everything needs doubling up if it's the primary location for storage. I had an early NAS box at work as a private backup device which incorporated (as best I recall) embedded Linux - and there was no performance problem. Also I'd assumed that cost aside, I could stuff a p/s up to 700 w + into this case? With no graphics card etc this would surely run a 3/4 drive RAID array?

I just found, at £13.86 inc, the following (on the DABS site). My old box only has PCI slots available. There are some useful customer reviews there. It doesn't appear to enable RAID 5 but this is contradicted in some of the reviews (read quickly).
PCI SATA Host Controller Card
4 Internal SATA port RAID 0, 1, 0+1 (Optional)
Specification
PCI 32-bit 33/66MHz interface
Compliant with PCI specification Revision 2.3
Support PCI bus-master access
Compliant with SATA 1.5Gbps (150MB/s)
Compliant with SATA 1.0 specification
Support 4 independent SATA ports
256 byte FIFO per port for fast Read/Write Operation
RAID 0, 1, 0+1 (Optional)
Support ATAPI devices: CD-ROM, DVD-ROM etc
System Requirement
PC computer with PCI slot
Windows 98SE, ME, NT4, 2000, XP, Linux & Netware


My intention was that if the primary destination for the files was the local drive (RAID mirror) and that if files were periodically transferred to the NAS device by a command-line script, then performance limitations would be irrelevant; it scarcely matters how long the files take to copy across the LAN in the background. I had no intention of using it "live". But I take your point(s). I hadn't actually costed this out yet. Part of the attraction, assuming that's the right word, was that it would force me to build a Linux machine and learn something about it. I downloaded a Knoppix image a while ago but haven't yet tried to burn it (CD sized release - can't remember if the old box has a DVD drive).
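The periodic transfer described above could be sketched roughly as follows. This is only an illustration, not a tested backup tool: it assumes the NAS share is already mounted at a local path, and the function name `backup_tree` and the example paths are made up for the example. It copies any file that is missing on the NAS or older there than the local copy.

```python
import shutil
from pathlib import Path
from typing import List


def backup_tree(source: Path, dest: Path) -> List[Path]:
    """Copy files that are missing or older at dest; return the copies made."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        target = dest / src.relative_to(source)
        # Skip files whose copy on the NAS is at least as new as the local one.
        if target.exists() and target.stat().st_mtime >= src.stat().st_mtime:
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)  # copy2 also copies the modification time
        copied.append(target)
    return copied


# Hypothetical usage, with made-up paths:
# backup_tree(Path("/home/john/photos"), Path("/mnt/nas/photos"))
```

Since it only copies what has changed, it can be run as often as you like from whatever scheduler the OS provides. Comparing modification times is a simplification; some network filesystems store coarser timestamps, in which case files may be recopied unnecessarily.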

The other route is to bite the bullet, build a multi-core Vista PC (which also forces me to take a look at Vista and thereby keep my hand in) with a large RAID array. Then gradually migrate the applications. This way I have a spare machine and plenty of storage space.

Given that I currently have little income, I'm trying not to spend money unjustifiably!
PS "Steve's hardware"? Not sure what this refers to...
john frum
 
Posts: 228
Joined: Tue Dec 05, 2006 12:46 pm
Location: S. England

Postby Kevgermany » Sat Nov 03, 2007 4:57 pm

When I lived in the UK, I found Dabs rather expensive. Don't know who to suggest in England, but I buy all my stuff online through the cheaper hardware specialists such as computer universe (English site available) or Litec in Munich (german only). In England I used to buy my components through a colleague who had a trade account with a component supplier.

That's a very cheap raid card. Most worthwhile ones come in more expensive than motherboards. Perhaps it's a clearance item, there can't be anyone still making PCI raid cards. But it'd work for you. Card also lacks Raid 5, which for me would be preferable to mirroring, especially as you grow the array to more drives/space (your doubling argument applies).
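To put rough numbers on the mirroring versus RAID 5 point, here is a small sketch of usable capacity for arrays of identical drives. The function name is invented for the example and the formulas are the usual simplified ones (no allowance for filesystem overhead or hot spares).

```python
def usable_capacity_gb(level: str, drives: int, size_gb: int) -> int:
    """Usable space for an array of identical drives (simplified)."""
    if level == "raid0":      # striping only, no redundancy
        return drives * size_gb
    if level == "raid1":      # every drive holds the same data
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return size_gb
    if level == "raid0+1":    # mirrored stripes: half the raw space
        if drives < 4 or drives % 2:
            raise ValueError("RAID 0+1 needs an even number of drives, 4 or more")
        return drives // 2 * size_gb
    if level == "raid5":      # one drive's worth of space goes to parity
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_gb
    raise ValueError(f"unknown RAID level: {level}")


for n in (4, 6):
    mirrored = usable_capacity_gb("raid0+1", n, 500)
    parity = usable_capacity_gb("raid5", n, 500)
    print(f"{n} x 500GB: RAID 0+1 = {mirrored}GB, RAID 5 = {parity}GB")
```

With four 500GB drives that is 1000GB mirrored against 1500GB with RAID 5, and the gap widens as drives are added.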

Power supply you describe would be overkill for 4 disks. Just check the physical dimensions.

Agree about the performance - not important given your described use.

I'd avoid Vista at the moment, too many compatibility issues - probably worse than Linux. However if you build new, getting 'Vista compliant' hardware is a good move, even if you run XP, which'd be my windows choice. Having said that, if you're careful with hardware, Linux would be my first choice.

Steve's hardware is an error! :oops:
I mixed up Steve's digicams and Tom's Hardware Guide..... Sorry.
www.tomshardware.com/
Lots of how-to articles and hardware reviews. They're rather biased towards Wintel, but there are AMD reviews and Linux. Some time last year I remember reading an article there on using an old machine as a linux NAS device, but I decided that for the cost I may as well have a new workstation with lots of storage.
Kev

Man is limited by his fears, not by his imagination.
Kevgermany
Site Admin
 
Posts: 5269
Joined: Fri Feb 25, 2005 9:11 pm
Location: near Munich, Germany

Postby john frum » Sat Nov 03, 2007 5:47 pm

Thanks again Kev
I thought I wasn't a million miles off the beam! Just digging about in the bewilderingly wide world of Linux user groups I have come across the comment that software raid can be configured within the Linux os - so no card essential - assuming I use IDE drives that are all this old pc is configured for.
You're probably right though. A couple of external 500Gb drives functioning as mirrored probably involves the least suffering!
john frum
 
Posts: 228
Joined: Tue Dec 05, 2006 12:46 pm
Location: S. England

Postby cem » Sun Nov 04, 2007 10:27 am

Hi John,

I have built and used a NAS based on Linux a few times before, for purposes similar to those you stated in your OP.
I still have an Ubuntu file and print server for my home network.
It is very doable. The performance is better compared to using XP Pro as the server. On a 1Gb network, I achieve an average of 40 MB/s in transfer speeds. So go for it.

Or buy a ready made NAS box that takes 2 drives, which you can configure as raid 0 or 1. Some of them are great such as:

Dlink DNS-323
http://www.smallnetbuilder.com/content/view/29671/75/

QNAP TS-101 & TS-201
http://www.hothardware.com/articles/QNAP%5FTS101%5Fand%5FTS201%5FNAS%5FServers1/

Synology DS-107
http://www.bjorn3d.com/read.php?cID=1077

Synology DS-207
http://www.techpowerup.com/reviews/Synology/DS207
http://www.smallnetbuilder.com/content/view/29957/75/
http://www.amug.org/amug-web/html/amug/reviews/articles/sansdigital/ds207/

Thecus N1200
http://www.hexus.net/content/item.php?item=8291

Thecus N2100
http://www.xbitlabs.com/articles/storage/display/thecus-n2100_3.html
http://www.hexus.net/content/item.php?item=4342

DLink and Synology (and maybe some others too) run on Linux, you can hack them using telnet and add some more server or firewall functionality if you wish.

Sorry, got to run now. Ask any questions for details if you want.

Cheers,

Cem
cem
 
Posts: 758
Joined: Sat Feb 26, 2005 8:26 am
Location: The Netherlands

Postby Costas L » Sun Nov 04, 2007 11:04 am

john frum wrote:...... as a simple file server the performance issue (processor, RAM, bus) wasn't er, an issue. I certainly recall a few articles that tended to suggest as much.

The problem with external drives - internal for that matter - is that everything needs doubling up if it's the primary location for storage. I had an early NAS box at work as a private backup device which incorporated (as best I recall) embedded Linux - and there was no performance problem. Also I'd assumed that cost aside, I could stuff a p/s up to 700 w + into this case? With no graphics card etc this would surely run a 3/4 drive RAID array?

I just found, at £13.86 inc, the following (on the DABS site). My old box only has PCI slots available. There are some useful customer reviews there. It doesn't appear to enable RAID 5 but this is contradicted in some of the reviews (read quickly).
PCI SATA Host Controller Card
4 Internal SATA port RAID 0, 1, 0+1 (Optional).......



Hello John

What you're suggesting is feasible for your old machine, but there are a few things you need to watch out for.

Firstly, Dell use proprietary power connectors for their motherboards, so you cannot just slot in an off the peg power supply. Also the boxes will often only support a maximum of 2 hard drives.

Having said that, there is probably enough power available to drive a couple of additional hard drives and a DVD writer (I assume it has that fitted or a CD reader) plus there should be room to convert an additional 5 inch bay from CD/DVD reader to hold a hard drive using a caddy so giving you 3 drives altogether.

It's best to use SATA hard drives as you identified because you can then take these forward to any new system you build. By the way, even new systems use PCI for things like SATA RAID cards; the PCI Express slots are mainly used for graphics. And as Kev said, you might want to update the operating system to XP, although Linux is very much a possibility, especially since you can do that for free. Check out the article below for a practical step by step approach
http://www.bit-tech.net/bits/2007/06/05 ... n_server/1


Essentially, buying a pair of 500GByte SATA drives and a SATA controller card to experiment with is not a dead end since you can move these forward to a new Vista system later. Even a copy of XP could be transferred to your new machine to give you a dual boot XP / Vista system.

Have fun :D
Costas
"How could I have been so mistaken as to trust the experts" John F Kennedy 1962
Costas L
Site Admin
 
Posts: 3000
Joined: Sat Feb 26, 2005 10:55 am
Location: UK

Postby Kevgermany » Sun Nov 04, 2007 12:58 pm

Kev

Man is limited by his fears, not by his imagination.
Kevgermany
Site Admin
 
Posts: 5269
Joined: Fri Feb 25, 2005 9:11 pm
Location: near Munich, Germany

Postby Kevgermany » Mon Nov 05, 2007 10:43 pm

John, you may find this interesting:

http://www.tomshardware.co.uk/FreeNAS-N ... 628-4.html
Kev

Man is limited by his fears, not by his imagination.
Kevgermany
Site Admin
 
Posts: 5269
Joined: Fri Feb 25, 2005 9:11 pm
Location: near Munich, Germany

Postby Kevgermany » Mon Nov 05, 2007 11:03 pm

A couple more
http://www.tomshardware.co.uk/diy-nas-s ... -1780.html
http://www.tomshardware.com/2004/06/25/ ... index.html

This is the one I was referring to:
http://www.tomshardware.co.uk/cheap-fas ... -1773.html

The last article is highly recommended and closely mirrors what you're thinking of. It uses an older version of Ubuntu; AFAIK Samba is installed as standard on Ubuntu these days.

I did a little research on Raid cards, following Costas's comments. Seems that there are more PCI cards still around than I thought; everything I've read recently led me to believe that the market had moved to PCI express. Most of the newer cards have moved away from PCI, but there's still a lot left. What I also noticed is that based on comments, there are raid cards and raid cards. You need to be careful what you buy, if you're still going to go down that route.
Kev

Man is limited by his fears, not by his imagination.
Kevgermany
Site Admin
 
Posts: 5269
Joined: Fri Feb 25, 2005 9:11 pm
Location: near Munich, Germany

Postby DavidW » Sat Nov 17, 2007 4:10 pm

It's not worth doing hardware RAID over 32 bit 33MHz PCI - which is all a typical desktop system has. The bandwidth limitation of PCI becomes apparent all too soon. Workstation and server machines used PCI-X (64 bit PCI running at either 100 or 133MHz) for high bandwidth cards like RAID and gigabit networking before PCI-e came along. Typical desktop machines only have a single PCI bus, which makes things worse, particularly if you want to put a Gigabit Ethernet card on that same bus.
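The arithmetic behind the bandwidth point can be sketched as below. These are theoretical peak figures only (bus width times nominal clock); sustained throughput in practice is lower, and the function name is just for illustration.

```python
def bus_peak_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel bus: bytes per transfer x MHz = MB/s."""
    return width_bits / 8 * clock_mhz


pci = bus_peak_mb_per_s(32, 33.33)     # ordinary desktop PCI
pcix = bus_peak_mb_per_s(64, 133.33)   # PCI-X at its fastest clock
gige = 1000 / 8                        # Gigabit Ethernet line rate in MB/s

print(f"PCI 32 bit / 33MHz   : {pci:.0f} MB/s, shared by every card on the bus")
print(f"PCI-X 64 bit / 133MHz: {pcix:.0f} MB/s")
print(f"Gigabit Ethernet     : {gige:.0f} MB/s, {gige / pci:.0%} of the PCI peak")
```

A Gigabit Ethernet card running flat out could therefore use most of a plain PCI bus by itself, before a RAID card on the same bus has moved any data.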

My main machine is an ageing dual Xeon 2.66GHz workstation. There's two PCI-X buses (one of which has a U320 SCSI controller on it, the other a Gigabit Ethernet controller on it, then both have slots) and a separate 32 bit 33MHz PCI bus, which has a Firewire controller and a slot. This helps prevent I/O bottlenecks. In fact, the real bottleneck in this system, like many Intel systems, is between the processor(s) and the memory.


The IDE and SATA ports you find on your motherboard are typically provided by the southbridge, and go nowhere near the PCI / PCI-X / PCI-e buses. This helps avoid bottlenecks.


I wouldn't bother buying a PCI-X RAID card now, unless you can get a decent one second hand at a reasonable price. PCI-X is disappearing from the latest motherboards, and within a year or two will be very rare on new hardware. You could finish up with a board that you can't easily take forward to a newer machine. You can use a PCI-X card in a PCI slot - but you'll soon hit the bandwidth limit.

If you find a second hand RAID card, the battery may be dead, and they can be far from cheap to replace.


I suspect that the Dell in question is a very early P4 machine that uses RDRAM. To be honest, because of the use of expensive RDRAM, and the relative age of the machine, I wouldn't bother with it - the limitations are very apparent and spending a lot of money on a machine that is around five years old seems crazy, especially as I expect that the ultimate limit on RAM is probably somewhere around 1GB (check the manual, which will be available on support.euro.dell.com). I'd try to pick up something else more suited to the task; a newer P4 machine which will take at least 2GB of much cheaper RAM shouldn't be that expensive.

At the moment, my server is an old PIII-733 which uses conventional SDRAM - but, unfortunately, the motherboard won't take more than 512MB of RAM. It's fine for what I have been using it for, but it looks as if I'm about to outgrow it. There's more about my current position and thinking ahead here - this was maybe a better thread for it, but I'd already posted there.


Most if not all of the inexpensive RAID NASes (Thecus, Buffalo Terastation and their ilk) use software RAID, as does FreeNAS (which is based around FreeBSD; the RAID 5 module plugs into the GEOM layer between the filesystems and the disks but hasn't yet been officially accepted into the FreeBSD kernel). Software RAID 5 often works well, but can go disastrously wrong - it's not as robust as you may think. The big problem is the 'write hole', especially on a 'partial stripe write'.

RAID 5 works via a simple mathematical relationship between the various drives. If you need to update part of a stripe (for example, you swap two letters over in a text file stored in a particular stripe), you have to read the entire stripe into memory, update it, then write the updated stripe back to all the disks. That's a partial stripe write. There's more steps to a partial stripe write than writing a complete stripe - so it can be a performance issue when you make lots of small changes, as well as it having more potential for it to go wrong than a full stripe write.

There is a time window for any stripe write - partial or otherwise - in which the disks are inconsistent with each other; if you have a power failure or software crash at that instant, the stripe can't be recovered because the mathematical relationship is broken. It will always be corrupt until it is overwritten with entirely new contents and the mathematical relationship is restored.
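As a rough illustration of that mathematical relationship and of the write hole, here is a toy model using XOR parity, which is what RAID 5 uses. It is heavily simplified: one stripe, four-byte blocks, parity kept in one place rather than rotated across the disks, and all the names are invented for the example.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus one parity block.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Normal case: the disk holding d1 dies, and it is rebuilt from the others.
assert xor_blocks(d0, d2, parity) == d1

# Write hole: new data for d0 reaches its disk, but the power fails
# before the matching parity block is written.
d0_new = b"ZZZZ"
stale_parity = parity

# If the disk holding d1 is later lost, the rebuild quietly produces
# garbage, because data and parity no longer agree.
rebuilt = xor_blocks(d0_new, d2, stale_parity)
print("d1 rebuilt correctly:", rebuilt == d1)
```

Nothing in the rebuild step can tell that the parity was stale, which is why the damage stays hidden until the data is needed.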

Worse still, the overlying file system has no idea about the underlying corruption of the stripe. FreeNAS is FreeBSD based, and I believe it uses FreeBSD's UFS2 filesystem, which doesn't have a journal, which by itself can lead to filesystem corruption on an unexpected power outage. This means that you can have a corrupt filesystem on top of a corrupt RAID - double trouble, and more potential for data loss.

One particularly nasty scenario is silent data corruption; the filesystem checks out OK when you restart the system, but you have corrupt stripes mixed into your data and no way of recovering that data other than a backup from before the corruption happened.
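One way to at least notice this kind of corruption on files that are not supposed to change (an archive of finished images, say) is to keep a list of checksums and compare against it later. The sketch below is an assumption on my part about how that could be done, not a feature of any of the RAID systems discussed, and the function names are invented. It will also flag files that were edited deliberately, so it only suits static data.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """SHA-256 of a file, read in 1MB chunks to keep memory use low."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Files from the manifest that are now missing or have different contents."""
    current = build_manifest(root)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)
```

The manifest itself needs to be stored somewhere other than the array it describes, otherwise it can be corrupted along with the data.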


Hardware RAID uses various performance and robustness enhancing tricks. Amongst these is RAM on the RAID controller dedicated to holding stripes that are being updated or that have been updated and are waiting to be written back to the disks - which can be battery backed, so that if the machine loses power, the stripes will be retained in that memory and written to the disks when power is available again (write ahead is not safe without battery backed memory on the RAID card). However, there's no inherent guarantee that the battery backed RAM will save you; the overlying filesystem may still have problems, and there's always the possibility of the RAID controller or its battery malfunctioning.

I believe it's vital to understand that especially the higher RAID levels are not all gain. On the face of it, RAID 5 (and, especially these days, RAID 6) are a dream solution to data integrity for our ever growing data storage needs. RAID will prevent some problems - after all, who can dispute the wonder of ripping one disk out of the server and everything staying working. However, it introduces whole new ways to lose and corrupt data, which often aren't understood well. Further, many tune these systems for performance - there are people out there who seem benchmark obsessed - not realising that they're compromising the integrity of their data.


I know someone who runs a hosting provider pretty well; I know that most of the machines he has scrapped over the years have been down to RAID related flakiness, and many of the outages are down to RAID related snafus. Increasingly he's moving to using iSCSI and blades, with the storage on expensive proprietary iSCSI servers that aim to provide high reliability. Even such expensive setups don't always save you - some outsourced servers he uses failed this week due to an air conditioning failure, which left the iSCSI system wrecked. The last I heard, they were couriering over drives to fit into local RAID controllers on the servers so that they could restore backups and get things back online.


An intriguing way out of much of this mess of RAID as it stands today is ZFS. It's still being developed, and is only seen at its most mature in Solaris and, to a large extent, OpenSolaris - it was, after all, a Sun innovation. I'm not going to go through the advantages of ZFS here, but it offers raidz and raidz2 which have no 'write hole' and mitigates against the 'expense' of partial stripe writes using variable stripe sizes (which reduces the computational burden and increases the robustness of writing to the array). ZFS also offers various data integrity and self healing capabilities. Even better is that it doesn't need a hardware RAID controller - in fact, you're better off not using such a controller with ZFS.

ZFS is in the upcoming FreeBSD 7.0, but it's experimental in 7.0. It's likely to be marked stable in 7.1, hopefully by around next summer - whether it will be possible to boot FreeBSD from ZFS by that time is uncertain. The person responsible for porting ZFS to FreeBSD has suggested that you really want a minimum of 1GB of RAM, and ideally a 64 bit machine - this isn't something that's going to run well on very old hardware.


What FreeBSD 7.0 does have is journal capabilities at the GEOM layer (which sits between the filesystems and the disk hardware). Journalling means that losing the power shouldn't corrupt the filesystem. For robustness on FreeBSD 7.0, I'd use UFS2 with gjournal on a gmirror pair (in other words, a journalled filesystem on RAID 1). In fact, it may be possible to do it the other way round - gmirrored UFS2 on a pair of gjournalled drives - I'm not sure which is more robust; I'd have to think about it.

Either way round, this may not be as 'sexy' as software RAID 5, but it is likely rather more robust and may well perform better. I would never run anything I was serious about the data on without a good quality UPS, either; all the machines here are on a UPS apart from the laptop, which effectively has a UPS by having its battery installed.


FreeBSD 7 will soon be at release candidate stage; you could maybe do worse than put a drive or two in the Dell, install 7.0-BETA2, and play. The FreeBSD Handbook is being updated to cover 7.0 as we speak - indeed, it's pretty much there now, though you may have to hunt around for instructions on things like getting gjournalled UFS2 going (I can walk you through if necessary). Really, you shouldn't use a non -RELEASE version of FreeBSD for production purposes unless you know what you're doing, but 7.0-BETA2 has already had a long shake-down period, and things are getting better for it all the time, whilst upgrading isn't too awkward.

The RAID 5 module FreeNAS uses may make it into FreeBSD 8, or just possibly into a 7.x release from 7.1 onwards, but the effort to get it officially included in the kernel may be decided as not worthwhile with ZFS hopefully becoming stable. There again, if you want it, there's nothing stopping you building it on a 'stock' FreeBSD machine.



David
DavidW
 
Posts: 723
Joined: Sat Oct 07, 2006 8:38 pm
Location: Bedfordshire, UK

Postby john frum » Tue Jan 01, 2008 1:32 am

Thanks everyone. I had a dead processor or motherboard in this box so it's toast - at which point I lost interest. I quickly scanned the last few posts which I need to read carefully. Since I started trying to learn panoramics my requirements for both storage and processing speed have risen steeply - and at a time when my income is at an all-time low! But I need to build something that will last for a while. Assuming I can negotiate the difficult decisions about mb & processors I need a system that incorporates some reliable RAID technology. Whether I can incorporate this into a working PC, or whether I need some kind of NAS, I'm not sure. I certainly can't face spending > £1K on the whole bundle.
Ideally I was thinking about 2 striped arrays - 1 for os, 1 for PS/Stitcher scratch files. Where do the apps then go? Then another RAID array for data storage.
Not sure all that can be done within that kind of price so I may have to compromise somewhere. DDR2 or 3? Vista or XP, 64 bit os?

We'll see.
john frum
 
Posts: 228
Joined: Tue Dec 05, 2006 12:46 pm
Location: S. England

