
Bizarre route problem

Discussion in 'Networking (Hardware & Software)' started by Pharo, 2007/11/09.

  1. 2007/11/09
    Pharo

    Pharo Inactive Thread Starter

    Joined:
    2007/10/27
    Messages:
    32
    Likes Received:
    0
    Good morning all,

    This is my first post and I have a stumper at work. Let me start with the layout. I have several servers serving 4 VLANs; I’ll call them 10.1, 10.2, 10.3 and 10.4. In addition, I have an isolated gigabit-over-copper switch in one of the server racks.

    The isolated switch doesn’t tie into the VLANs at all; it is used for running backup jobs only. The backup server has a multi-terabyte disk array as well as a 24-tape carousel for this.

    Each server is dual-homed: one card goes to the 10.x VLAN (public) and the other card goes to the backup (isolated) network, 192.168.0.xxx.

    This arrangement is working great except for one problem child: my POS server. The POS server talks to several gateway servers, which route traffic to either another server or an AS/400. This is for credit card info or other in-house account info.

    The results of these requests are then sent back to the POS server as either a yes or a no response, through either the same gateway or another gateway depending on the request.

    For example, a credit card request would go out through a serial-to-Ethernet converter (lame I know, but I didn't write the program) and on to the AS/400. The AS/400 would process the request and send it back to the POS server through the same path.

    This part works great. However, another type of request might use the (public) NIC to send a request to a gateway, which would send it to another server, which would verify the customer data and send a yes or no response back to the POS server through yet another gateway. In this arrangement one gateway sends the request and another one handles the response. Before you ask, I don’t know why the programs were written this way; I just have to deal with it.

    It’s the second kind of request that is causing me grief. If the gateway returning the response is restarted while the second NIC (192.168.0.X) in the POS server is configured and enabled, the gateway will assume that is the return IP address and requests start timing out. As long as I don’t restart the gateway, it works correctly. But when changes are made, the gateways have to be restarted.

    This is blowing my mind because the gateway servers are not dual-homed and are not on the backup (192.168.x.x) network. The data never changes on those servers, so there is no need to back them up.

    All IP addresses in this arrangement are static. Subnetting is as follows:

    Public
    IP 10.1.xxx.xxx
    Subnet 255.255.0.0
    Gateway 10.1.x.x

    Isolated
    IP 192.168.0.X
    Subnet 255.255.255.0
    No gateway

    All servers involved run SQL, if that makes a difference, and the only gateway this happens on is the one returning the response.

    I route the backup jobs to the backup server from the servers being backed up by using routes listed in the backup server's hosts file. But I don’t understand how this one gateway is picking up the wrong IP address.

    If anyone can solve this I will gladly post a reply that your kung fu is stronger than mine. :D

    Thank you all in advance
     
  2. 2007/11/09
    blunam

    blunam Inactive

    Joined:
    2007/11/09
    Messages:
    20
    Likes Received:
    0
    Wow, you have your work cut out there. That is a complicated one to follow in a posting. The good news is that such issues tend to be really simple; the trick is to identify the correct problem. First you need to know the true routes of your packets. At a cmd prompt, type tracert ipaddress; that will return the path your packets are taking. You need to do this both when it is working and when it is not. That will show you the difference and let you see where this is going wrong.

    Note: Tracert will return the interface of a router that the packet arrives at, not the interface it leaves from.

    Once you have identified at what point the packet goes wrong, you need to check the routing table on that device (in Windows the command at the cmd prompt is route print). The information from that should lead you in the right direction.

    Remember, routers and servers can pick up a route through protocols like RIP and OSPF. So your problem may just be that when that interface goes down the system RIPs out a new route.

    Hope that helps

    Bill
    www.kinetics.co.nz
     


  4. 2007/11/09
    Pharo

    Pharo Inactive Thread Starter

    Joined:
    2007/10/27
    Messages:
    32
    Likes Received:
    0
    Well, that was the first thing I thought of while I was having the issue, so I did a ping by name from the gateway and it returned the correct address. That lets DNS off the hook. The OS of the gateway server knows the correct IP address. It's the gateway software that gets confused.

    Tracert showed the correct path as well both by name and IP.

    Running a tracert to the 192.168 IP times out because the gateway server is not on the backup (isolated) network so there is no way it can get to it.

    I can run a program called Comscr.exe from the gateway that shows where it's trying to send the data. That's what tipped me off: it showed me the 192.168 address it was trying to hit.

    The gateway app is a SQL app and runs as a service on the gateway server. I've opened a case with the vendor on this but support from that vendor has never been all that good.

    I thought about putting an entry in the gateway server's hosts file, but I figure if the OS (Windows 2000 Server) already knows the correct path it won't make a difference.

    I also disabled the service on the backup server that would allow it to become the master browser. But this didn't help either.

    I think this is more of a SQL issue but what stumps me is the other SQL gateways are not affected.

    However, I'll have to resolve this if I want to run my backups over an isolated network. It just takes up too much bandwidth to run backups over the public network. But if I have to, I'll run that backup later at night. All the other backups run just fine.

    The one server I expected to have issues doing this (MS Exchange) turned out to be a silver star winner. Go figure.

    But I agree with you that it will probably end up being something simple.

    Thanks again,

    PBiZ
     
  5. 2007/11/09
    blunam

    blunam Inactive

    Joined:
    2007/11/09
    Messages:
    20
    Likes Received:
    0
    I'm on a PDA right now, but have tried reading this again. I think you are saying that it routes back out through the wrong NIC?

    In theory SQL itself does not decide the path. However, if it somehow binds to the wrong NIC it may choose that NIC. Route print will show the weighting of possible routes, and that may be a clue. Later tonight I will check SQL Enterprise Manager to see if it can have any IP settings. I assume it's full SQL and not Express or MSDE.
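
    To make the "weighting" idea concrete, here is a minimal Python sketch of how a routing table is usually consulted: the most specific matching network wins, and the metric breaks ties. The table entries, the 10.1.0.1 default gateway and the interface names are made up for illustration; the real values would come from route print on the POS server.

```python
import ipaddress

# Hypothetical routing table for a dual-homed host such as the POS server.
# Each entry: (destination network, gateway or None if directly attached,
# interface label, metric). None of these values come from the thread.
ROUTES = [
    ("0.0.0.0/0", "10.1.0.1", "public", 10),
    ("10.1.0.0/16", None, "public", 10),
    ("192.168.0.0/24", None, "backup", 10),
]


def pick_route(destination, routes=ROUTES):
    """Pick a route: longest matching prefix first, then lowest metric."""
    dest = ipaddress.ip_address(destination)
    candidates = []
    for network, gateway, interface, metric in routes:
        net = ipaddress.ip_network(network)
        if dest in net:
            # Negate the metric so that max() prefers the lowest metric.
            candidates.append((net.prefixlen, -metric, gateway, interface))
    if not candidates:
        return None
    prefixlen, neg_metric, gateway, interface = max(
        candidates, key=lambda c: (c[0], c[1])
    )
    return {"interface": interface, "gateway": gateway, "metric": -neg_metric}


if __name__ == "__main__":
    for target in ("10.1.5.20", "192.168.0.7", "10.2.3.4"):
        print(target, pick_route(target))
```

    With a table like this, only traffic addressed to 192.168.0.x should leave through the backup NIC; anything else on that NIC would point at a binding or addressing problem rather than a routing one.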

    bill
    http://www.kinetics.co.nz
     
  6. 2007/11/09
    Pharo

    Pharo Inactive Thread Starter

    Joined:
    2007/10/27
    Messages:
    32
    Likes Received:
    0
    Yes, the wrong NIC on the POS server. The gateway server is not dual-homed and is only on the public network. It can't even see the second NIC's IP address because the gateway server is not on the backup (isolated) network.
     
  7. 2007/11/09
    Pharo

    Pharo Inactive Thread Starter

    Joined:
    2007/10/27
    Messages:
    32
    Likes Received:
    0
    Perhaps I need to clarify a few things in the layout.

    The POS server is dual-homed to both the public (10.1) and isolated (192.168) networks. The isolated network (192.168) is for running backup jobs only and can't be seen from the public network. This way, the bandwidth of the public network is never impacted by the backup jobs.

    The gateway servers are not dual-homed and are only on the public network. The gateway servers can't even talk to the isolated network (192.168), only the public network (10.1).

    The backup server is dual-homed. It can talk to both networks (10.1 and 192.168). It pulls data from some of the other servers (such as the POS server) over the isolated network (192.168) during the backup job runs. The only reason it's on the public network (10.1) is so we can remote into it from the office.

    The gateway server has to be picking up the IP address (from the NIC on the isolated network 192.168) from either the POS or the backup server. But it's not the gateway server's OS that's doing it, because a ping by name from that server always returns the correct IP address. It's only the gateway app that picks up the isolated (192.168) NIC's IP address.

    The stream I can read from Comscr looks something like this.


    *****************************************
    M1 PMS request status (Status_Approved)
    M1
    M1
    M1 IP 192.168.0.X
    M1
    M1 Terminal ID 17
    ******************************************

    There is other junk in the stream but this is the important text that gives away the error. The IP address that should be in this stream (when it's working correctly) is 10.1.xxx.xxx.
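
    A check like this could be scripted against saved Comscr output. The following is only a sketch in Python: it assumes the stream has been captured to text and that address lines look exactly like the "M1 IP ..." line above, and the sample addresses are invented because the real ones are elided in this thread.

```python
import ipaddress
import re

# Public subnet as described in the first post (10.1.x.x, mask 255.255.0.0).
PUBLIC_NET = ipaddress.ip_network("10.1.0.0/16")

# Matches lines shaped like "M1 IP 192.168.0.5" (assumed format).
IP_LINE = re.compile(r"^\s*M1\s+IP\s+(\d{1,3}(?:\.\d{1,3}){3})\s*$", re.MULTILINE)


def reported_ips(stream_text):
    """Return every address found on an 'M1 IP' line of the captured text."""
    return [ipaddress.ip_address(match) for match in IP_LINE.findall(stream_text)]


def wrong_network_ips(stream_text, expected=PUBLIC_NET):
    """Return the reported addresses that fall outside the expected subnet."""
    return [str(ip) for ip in reported_ips(stream_text) if ip not in expected]


if __name__ == "__main__":
    # Hypothetical capture; 192.168.0.5 stands in for the elided address.
    sample = """
    M1 PMS request status (Status_Approved)
    M1 IP 192.168.0.5
    M1 Terminal ID 17
    """
    print(wrong_network_ips(sample))
```

    Running it before and after a gateway restart would show exactly when the reported address flips from the 10.1 range to the 192.168 range.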

    But even when this error is happening, I can still ping the POS server by name or IP from the gateway server. If I ping it by name, it returns the correct IP (10.1.xxx.xxx).

    Thanks again,

    PBiZ
     
    Last edited: 2007/11/10
  8. 2007/11/10
    blunam

    blunam Inactive

    Joined:
    2007/11/09
    Messages:
    20
    Likes Received:
    0
    Without getting into packet sniffers etc. to capture the packet on each hop, this is what my logic is telling me right now. Is there just the one hop between the two servers?

    Option 1. Although the gateway server is not connected to the backup network, the source and destination servers are. When the gateway goes down, the two servers (or another gateway) alert the network to the loss of the gateway. Because the two servers know they have an alternative route (the backup network), they start to use that. Try disabling RIP listening on those two servers. Did you do a tracert (not a ping) while the issue was happening? You will most likely need to do the route print during the issue as well. It's also possible that another gateway is sending out the incorrect route info when it senses that the gateway is down.

    Option 2. The destination server is in your DNS/WINS or a hosts/lmhosts file twice, once with the 10 address and once with the 192 address. So long as the 10 address is available, that is used; when it is not, the 192 address is used. I have seen a server with 2 IP addresses on the one NIC where a ping -a will always return the first address, but if I reboot that server the DNS server will return the secondary address on the ping -a. Don't ask me why.
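
    For the hosts/lmhosts part of Option 2, a quick way to spot a name listed under two addresses is to scan the file. This is a minimal Python sketch that assumes the usual "address name [aliases]" line format with # comments; the host names and addresses in the example are hypothetical.

```python
from collections import defaultdict


def names_with_multiple_addresses(hosts_text):
    """Return {name: [addresses]} for names mapped to more than one address."""
    seen = defaultdict(set)
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            seen[name.lower()].add(address)  # host names are case-insensitive
    return {name: sorted(addrs) for name, addrs in seen.items() if len(addrs) > 1}


if __name__ == "__main__":
    # Hypothetical file contents; the thread does not show the real entries.
    example = """
    10.1.20.30    posserver      # public NIC
    192.168.0.5   posserver      # backup NIC
    10.1.20.40    gateway1
    """
    print(names_with_multiple_addresses(example))
```

    The same idea applies to DNS: if the POS server's name has both a 10.1 and a 192.168 record, any application that asks for all addresses may pick the wrong one.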
     
  9. 2007/11/10
    blunam

    blunam Inactive

    Joined:
    2007/11/09
    Messages:
    20
    Likes Received:
    0
    Been reading through this again and it really does sound like a RIP or OSPF issue. When the gateway goes down, something detects that it is no longer available and RIPs out an alternative route through the 192 network. You have referred to them as gateways, and the impression I have is that the gateways are servers rather than hardware routers. But you also talked about VLANs, and typically that does mean hardware devices, and it's these hardware devices that are most likely to be pushing this info out. You may need to ask your support to come in and replicate the issue while they are there.
     
  10. 2007/11/10
    Pharo

    Pharo Inactive Thread Starter

    Joined:
    2007/10/27
    Messages:
    32
    Likes Received:
    0
    Option 1 is no good on the POS server because I have cash registers on 2 different subnets. Yes, there is only 1 hop between the gateway and the POS server.

    Option 2 looks promising but after I delete the 192 entry how do I keep DNS from just putting it back in?

    Oh well, thanks for the help.

    PBiZ
     
  11. 2007/11/10
    Pharo

    Pharo Inactive Thread Starter

    Joined:
    2007/10/27
    Messages:
    32
    Likes Received:
    0
    Well, when I say "gateway" I don't mean a router or layer 3 switch. I mean a server running a SQL gateway app or bridge. These gateways only route data directed at them from one host or another.

    They only handle specific data like credit card info, hotel room charges or in-house account data. They certainly don’t route all network traffic. I have routers and layer 3 switches for that.

    The VLAN I speak of is handled by a Nortel Passport and does not do anything with the backup network. The backup switch is in the server rack with the servers. The Passport is in the hub room about 150’ down the hall and nothing connects the two switches. If I telnet into the Passport and do a ping to 192.168.x.x it will only return a time out.

    It’s not my switching causing this.


    Later,

    PBiZ
     
  12. 2007/11/11
    blunam

    blunam Inactive

    Joined:
    2007/11/09
    Messages:
    20
    Likes Received:
    0
    Option 2
    On the network connection for that server on that address, go into its properties, then go into TCP/IP, then choose the Advanced button, then the DNS tab. Second tick box from the bottom: untick "Register this connection's addresses in DNS". That should stop it updating DNS with the server's name on that IP address. Make sure it's not in WINS as well; the only way to stop it updating WINS is to not have WINS set on that IP address.
     
  13. 2007/11/11
    ReggieB

    ReggieB Inactive Alumni

    Joined:
    2004/05/12
    Messages:
    2,786
    Likes Received:
    2
    I think one problem is that you are using HOSTS file entries where you should be using static routes. For example, if the gateway is on the public side of the POS system and only has a 10.0.0.0 address (no 192.168.0.0 address), adding a HOSTS file entry for 192.168.0.0 will only tell the system what name to map to the 192.168.0.0 address. It will not tell the system how to get a packet to that address. As the gateway is not directly attached to the 192.168.0.0 network, it will send packets to it via its default gateway, unless you set up a static route.

    Have a look at this technet article:

    http://technet2.microsoft.com/windo...99b7-4685-9542-24337b5deb401033.mspx?mfr=true
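
    The difference between a HOSTS entry and a static route can be sketched as two separate lookups: the name is turned into an address first, and only then does the routing decision pick a next hop. The Python below is an illustration under assumed values (the host name, every address and the 10.1.0.1 default gateway are hypothetical), not a description of how the gateway application works.

```python
import ipaddress

# Hypothetical values; the thread elides the real addresses.
HOSTS = {"posserver": "192.168.0.5"}   # all a HOSTS entry provides: name -> address
LOCAL_INTERFACE = "10.1.20.40/16"      # single-homed gateway server on the public side
DEFAULT_GATEWAY = "10.1.0.1"


def next_hop(destination, static_routes=()):
    """Decide where a packet goes: on-link, a static route, or the default gateway."""
    dest = ipaddress.ip_address(destination)
    if dest in ipaddress.ip_interface(LOCAL_INTERFACE).network:
        return "on-link"
    for network, gateway in static_routes:
        if dest in ipaddress.ip_network(network):
            return gateway
    return DEFAULT_GATEWAY


def send_to(name, static_routes=()):
    """Resolve the name via HOSTS, then make the routing decision separately."""
    address = HOSTS[name]
    return address, next_hop(address, static_routes)


if __name__ == "__main__":
    # Without a static route the packet heads for the default gateway.
    print(send_to("posserver"))
    # With a static route (here via a made-up 10.1.20.30) the next hop changes.
    print(send_to("posserver", [("192.168.0.0/24", "10.1.20.30")]))
```

    In both calls the HOSTS lookup returns the same address; only the static route changes where the packet is sent next.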
     
  14. 2007/11/11
    Pharo

    Pharo Inactive Thread Starter

    Joined:
    2007/10/27
    Messages:
    32
    Likes Received:
    0
    I did that when I set it up; however, I have noticed a strange quirk. When I uncheck this option for the one card, it also unchecks it on the other. I noticed this doesn't happen on a 2003 server, so maybe it's a 2000 limitation?

    TIA,

    PBiZ
     
  15. 2007/11/11
    Pharo

    Pharo Inactive Thread Starter

    Joined:
    2007/10/27
    Messages:
    32
    Likes Received:
    0
    Interesting, but I'm beginning to wonder if it would just be easier to put a second NIC in the gateway and plug it into the backup network. Then it would find the route no matter what. But this seems like kind of a waste, as I would never back up the gateways.

    Thanks for the reply.

    PBiZ
     
