
1/30/2008

SMTP::From Address Spoofing


Sender Policy Framework
The Problem: Sender Address Forgery
Today, nearly all abusive e-mail messages carry fake sender addresses. The victims whose addresses are being abused often suffer from the consequences, because their reputation gets diminished and they have to disclaim liability for the abuse, or waste their time sorting out misdirected bounce messages.

The Solution: SPF
The Sender Policy Framework (SPF) is an open standard specifying a technical method to prevent sender address forgery. More precisely, the current version of SPF (called SPFv1 or SPF Classic) protects the envelope sender address, which is used for the delivery of messages.
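To make the idea concrete, here is a minimal Python sketch of how a receiving server might evaluate a published SPFv1 record against the IP address of the connecting sender. It is an illustration only: the function name check_spf_ip is made up, the record and addresses are documentation examples, and only the ip4, ip6 and all mechanisms are handled. A real implementation also needs DNS lookups for mechanisms such as a, mx and include.

```python
import ipaddress

# Qualifier prefixes defined for SPFv1 mechanisms and the result each one yields.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf_ip(record, sender_ip):
    """Evaluate only the ip4, ip6 and all mechanisms of an SPFv1 record.

    Hypothetical helper for illustration; returns 'pass', 'fail',
    'softfail', 'neutral' or 'none'.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPFv1 record
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        # The qualifier is optional; '+' (pass) is the default.
        if term[0] in QUALIFIERS:
            result, mechanism = QUALIFIERS[term[0]], term[1:]
        else:
            result, mechanism = "pass", term
        name, _, value = mechanism.partition(":")
        name = name.lower()
        if name == "all":
            return result  # 'all' matches every sender
        if name in ("ip4", "ip6") and value:
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return result
        # Mechanisms that need DNS (a, mx, include, ...) are skipped in this sketch.
    return "neutral"  # nothing matched


# Example record: mail may only come from 192.0.2.0/24, everything else fails.
example_record = "v=spf1 ip4:192.0.2.0/24 -all"
print(check_spf_ip(example_record, "192.0.2.10"))
print(check_spf_ip(example_record, "198.51.100.7"))
```

Mechanisms are evaluated left to right and the first match decides the result, which is why the catch-all -all term goes last.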

1/24/2008

Storage::Alphabet Soup


JBOD = "Just a Bunch Of Disks"
SBOD = "Switched Bunch Of Disks"

Switched = Better ;)

Article

1/17/2008

Data Center::Fire Suppression


FM200
Interesting information gleaned from the vendor's overview:
- This system puts out a fire by quickly lowering the temperature of the room by 20 degrees or more. This also creates a vacuum in the room which, in addition to the lower temperature, puts out the fire. This change in pressure can displace ceiling tiles and stir up dust from the floor. Very shortly after the gas is deployed, the room warms back up and the pressure returns to normal.
- The gas is inert and not toxic to breathe.
- The gas disperses sideways from a nozzle that looks like a sprinkler head.
At our site this will be integrated with the same control system as our pre-action system.
So it will work as follows:
- One smoke alarm in the data center => the pre-action system opens the water control valve, making water available to the system. The pipes remain pressurized with air, so they hold no water until the heat from a fire causes a sprinkler head to open.
- Multiple smoke alarms in the data center => the FM200 system sounds its alarm and, after a 30-second delay, releases the gas. The temperature drops sharply, ceiling tiles are sucked down into the room (some will fall out), and "hurricane" wind may blow more dust up from the floors. The A/C system is shut down to prevent air flow that would further feed the fire. A few seconds later the room warms up and the pressure returns to normal. No cleanup procedure is required (just dust things off).
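The two-stage sequence above can be written down as a small decision function. This is only a sketch of the logic as the vendor described it, not the actual control panel programming. The function name, the action strings, and the assumptions that "multiple" means two or more alarms and that the pre-action valve also opens in that case are mine.

```python
FM200_DELAY_SECONDS = 30  # delay between the FM200 alarm and the gas release


def suppression_actions(active_smoke_alarms):
    """Return the ordered actions for a given number of active smoke alarms.

    Hypothetical helper; the action names are illustrative only.
    """
    if active_smoke_alarms < 0:
        raise ValueError("alarm count cannot be negative")
    actions = []
    if active_smoke_alarms >= 1:
        # Any smoke alarm arms the pre-action system. Water still only flows
        # once heat from a fire opens a sprinkler head.
        actions.append("open pre-action water control valve")
    if active_smoke_alarms >= 2:
        # Assumption: "multiple" alarms means two or more.
        actions.append("sound FM200 alarm")
        actions.append("wait %d seconds" % FM200_DELAY_SECONDS)
        actions.append("release FM200 gas")
        actions.append("shut down A/C")
    return actions


for alarms in (0, 1, 2):
    print(alarms, suppression_actions(alarms))
```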
Other interesting info from Q&A
- A 4-foot square of ceiling is required around each of our sprinkler heads for proper operation. If the sprinkler head is not at the top of the ceiling, it will not heat up at the same rate as the rest of the room and will not kick in soon enough.
- Sprinkler heads: the bottom plate melts off at 135 degrees F. That exposes an element that melts at about 155 degrees F.
- In some cases the locality may allow water fire suppression systems to be removed. This would likely require a backup system. Often, however, removal is not allowed by local statutes and/or building management.

Windows::Server Performance::Troubleshooting::Citrix


Troubleshooting Server Performance
The discussion of a specific issue below is perhaps useful in a more general sense for troubleshooting and performance monitoring topics.

Problem: After upgrading to Citrix Presentation Server 4.5, we observe higher average CPU utilization as well as a high rate of context switches. Before the upgrade we had often received warnings in Citrix Performance Monitor for %interrupt; this issue continues and is perhaps seen more often on the 4.5 servers.

Background: We run PS4.5 with published applications and desktops on Microsoft Windows 2003 SP2 on a physical machine. We also run several "high maintenance" accounting applications as published applications on two PS4.5 virtual machines in a VMware Virtual Infrastructure 3.0 cluster. All of these have exhibited the symptoms above only since the upgrade to 4.5. We are also still running 4.0 on several other servers in the same Citrix farm, and client machines use various versions of PNA (predominantly 8.x).

Investigation regarding context switches
A lot of good resources turned up:
Intel: Using Windows Performance Monitor
Sysinternals
www.thomaskoetzing.de
MSDN-Context Switches
Analyzing Processor Activity
Since this issue occurs on both physical and virtual servers, it is not a VM problem, but I will investigate this avenue as well to ensure correct and optimal configuration.
VMware: improving scalability for Citrix PS
http://redmondmag.com/features/article.asp?editorialsid=718


- Definition: CPUs share their time among all threads according to priority. When a CPU stops working on one thread and starts working on another, that is a context switch.
- Monitoring: As a ballpark rule of thumb, there should "normally" be no more than 28000 context switches per second per CPU on a system.
- What to look for:
  - Page file that is too small or is allowed to grow dynamically. Recommendation: set it to a larger fixed size.
  - Consider write cache on the RAID controller
  - Insufficient hardware
  - Poorly designed device drivers or applications
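As a rough illustration of the rule of thumb above, the following Python sketch turns two readings of a cumulative context switch counter into a per-CPU rate and compares it with the 28000 figure. It assumes that figure is a per-second rate, as the PerfMon counter name Context Switches/sec suggests. The function names are made up, and the sample numbers are invented rather than measurements from our servers. On a live system the cumulative count could come from something like psutil.cpu_stats().ctx_switches, if the third-party psutil package is installed.

```python
RULE_OF_THUMB_PER_CPU = 28000  # ballpark ceiling, assumed to be per second per CPU


def context_switches_per_cpu(count_start, count_end, seconds, cpu_count):
    """Average context switches per second per CPU between two counter samples."""
    if seconds <= 0 or cpu_count <= 0:
        raise ValueError("seconds and cpu_count must be positive")
    return (count_end - count_start) / seconds / cpu_count


def exceeds_rule_of_thumb(count_start, count_end, seconds, cpu_count):
    """True when the per-CPU rate is above the ballpark ceiling."""
    rate = context_switches_per_cpu(count_start, count_end, seconds, cpu_count)
    return rate > RULE_OF_THUMB_PER_CPU


# Invented sample: the counter grew by 600000 in 10 seconds on a 2-CPU server.
rate = context_switches_per_cpu(1000000, 1600000, 10, 2)
print(rate, exceeds_rule_of_thumb(1000000, 1600000, 10, 2))
```

Sampling over several seconds smooths out short bursts, which matters because a single spike says little about sustained load.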

Tools
- PerfMon: System\Context Switches/sec
- SysInternals Process Explorer: View > Select Columns > Process Performance > Context Switches, Context Switch Delta
- pstat.exe (Windows Resource Kit or Support Tools)

VMware
Some asides that came up during this investigation explained some issues we have had with virtualizing Citrix servers. We needed to keep 2 CPUs in the VMs after we converted them, which is the opposite of the VMware recommendations we have seen.
- The multiprocessor HAL had not been downgraded to the single-processor HAL.
- Hidden devices in Device Manager had not all been removed. To remove them:
1. Click Start, click Run, type cmd.exe, and then press ENTER.
2. Type set devmgr_show_nonpresent_devices=1, and then press ENTER.
3. Type Start DEVMGMT.MSC, and then press ENTER.
4. Click View, and then click Show Hidden Devices.
5. Expand the Network Adapters tree.
6. Right-click the dimmed network adapter, and then click Uninstall.
7. Uninstall any other physical devices that are not needed.


Investigation
- Interesting: on the VM servers, the %CPU values that Task Manager listed for the individual processes of all users did not appear to add up to what the Performance tab showed (a discrepancy of at least 50%). This was not observed on the physical server.
- For both VMs and physical servers, Citrix Performance Monitor was showing warnings and intermittent error conditions on %cpu, %interrupt, and context switches/sec.
- The VMs' CPU utilization on the host machine is extremely high. The server with the greatest number of users maxed out the host CPU for much of the time I watched it.
- Watching Performance Monitor for a few minutes showed context switches/sec in the hundreds of thousands.
- I opened Process Explorer and set the view to show context switches and context switch deltas. At times it reported that up to 50% of CPU was due to hardware interrupts (this was not as dramatic when I checked on the physical machine, so I wonder if this is a reporting issue related to VMware's magic behind the scenes). The highest context switch delta was also for hardware interrupts, so Process Explorer was no help in isolating it further.
- To isolate which driver or program might be causing this issue, I redirected the output of pstat.exe to a file and looked for the thread with the highest count of context switches. I took the memory address of that thread and looked it up in the bottom section of the output to find which address range it fell in. In this case it was CDM.SYS.
- A Google search for CDM.SYS turned up multiple articles about Citrix servers. I think CDM stands for Client Drive Mapping. Of greatest interest is an article about a hotfix for PS4.5:
http://support.citrix.com/article/CTX114121 (and I see a lot of other post FR1 hotfixes out there too.)
The issue resolved in this hotfix is:
"Winlogon.exe shows higher than average CPU consumption on the server. The issue occurs because the server refreshes the smart card reader state more frequently than necessary. This occurs even if smart cards are not being used. With this fix, the reader state is refreshed only once per noticeable event."